
OCLC Reclamation

Purpose:
A two-way synchronization of OCLC WorldCat holdings with the local Aleph library database.
Current status:
OCLC has completed its processing; FCLA and USF are analyzing exception files.
Benefits:
  • More up-to-date indication of your holdings in OCLC, which can improve ILL service.
  • Improved merging of Aleph records with those of other SULs for ingest into Mango/Endeca.
  • More accurate counts of holdings in the WorldCat Analysis project.
  • Removal of obsolete OCLC codes.

Getting started on your reclamation: information for the other SULs as they contemplate a reclamation.

University of South Florida Reclamation Project Summary
Report by
Daniel Cromwell, LMS Field Specialist, FCLA
3/4/09

Brief summary:
In August of 2008, USF contacted FCLA to inquire about the best way to extract and deliver their database records to OCLC for a reclamation project. Because of the complexity of extracting such a large number of BIBs while preserving appropriate holdings data, it was decided that FCLA staff should perform the extraction and delivery of the records. After negotiating specifications for the extract with USF staff and clarifying the project scope and other procedural and technical aspects with OCLC, records were extracted on 12/08/2008 and delivered to OCLC. The extract consisted of 1,806,810 BIB records, which were transferred to OCLC in 21 files, each containing up to 90,000 records. OCLC finished processing all 21 files by 01/23/2009.
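As a quick check on the file count above, the sketch below computes how many files are needed when each holds at most 90,000 records. The record total and the per-file cap come from this report; the function names and the "fill each file to the cap" split are assumptions for illustration only, since the report does not say how records were actually distributed across the 21 files.

```python
import math

TOTAL_RECORDS = 1_806_810   # BIB records in the extract (from the report)
MAX_PER_FILE = 90_000       # upper bound per file (from the report)


def min_files_needed(total, max_per_file):
    """Smallest number of files that can hold `total` records."""
    return math.ceil(total / max_per_file)


def split_sizes(total, max_per_file):
    """Hypothetical split: fill each file to the cap, remainder goes last."""
    full, remainder = divmod(total, max_per_file)
    return [max_per_file] * full + ([remainder] if remainder else [])


files = min_files_needed(TOTAL_RECORDS, MAX_PER_FILE)
sizes = split_sizes(TOTAL_RECORDS, MAX_PER_FILE)
print(files, len(sizes), sum(sizes))
```

Under this assumed split, the minimum file count agrees with the 21 files reported.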

Overview of process:
1. Contact OCLC, review their documents, submit a Batch services request (USF) – Aug 2008
2. Analyze bib. and holdings data in Aleph, answering questions below (USF and FCLA) – Aug – Nov 2008
3. Work out process to extract records with FCLA (FCLA) – Oct – Nov 2008
4. Send test file to OCLC (FCLA) – Oct 2008
5. Create a specifications document (FCLA and USF) – Sept – Oct 2008
6. Library and OCLC sign off on specifications – Nov 2008
7. Extract files and send to OCLC (FCLA) – Dec 2008
8. OCLC analyzes records and asks questions – Jan 2009
9. OCLC processes all records, returns output files (OCLC) – Jan 2009
10. Analyze output and decide on actions to be taken (USF and FCLA) – Jan 2009 – ongoing

The numbers:
Below is the totals line from the OCLC overall summary spreadsheet as received after OCLC completed processing the files. The column “Records” is the total number of BIBs sent and processed. The column “Holdings Set” is the total number of holdings set for the “Group” project, meaning one record could have several different OCLC holdings symbols set. The column “Unresolved” counts the BIB records that did not match and therefore did not get any holdings set.

          Records     Holdings Set   Unresolved
Totals:   1,806,810   1,829,229      139,409

NOTE: From counts of document records in ORACLE, a rough estimate of the total number of BIBs in USF’s database is around 1,941,560. About 102,984 EEBO records were not sent in the extract. Measured against that estimate, the 1,806,810 records sent are roughly 93% of the database. Adding the EEBO records to those extracted and sent to OCLC accounts for over 98% of the USF database. The remaining records not sent consist of ACQ-CREATED, Provisional, and CIRC-CREATED records.
The total number of BIBs that had one or more holdings set is “Records” minus “Unresolved”, which means 1,667,401 records had holdings set during the project.
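The percentages and the matched-record count in the note can be reproduced with a few lines of arithmetic. All four input figures come from this report; the variable names are only illustrative, and the database total is itself a rough estimate, so the percentages are approximate.

```python
ESTIMATED_TOTAL = 1_941_560   # rough estimate of BIBs in USF's database
SENT = 1_806_810              # "Records" column: BIBs extracted and sent
EEBO_NOT_SENT = 102_984       # EEBO records left out of the extract
UNRESOLVED = 139_409          # "Unresolved" column

# Share of the estimated database covered by the extract alone.
pct_sent = 100 * SENT / ESTIMATED_TOTAL

# Share covered once the unsent EEBO records are counted as well.
pct_with_eebo = 100 * (SENT + EEBO_NOT_SENT) / ESTIMATED_TOTAL

# Records neither sent nor EEBO (ACQ-CREATED, Provisional, CIRC-CREATED).
remaining = ESTIMATED_TOTAL - SENT - EEBO_NOT_SENT

# BIBs that had one or more holdings set: "Records" minus "Unresolved".
matched = SENT - UNRESOLVED

print(f"{pct_sent:.1f}% sent, {pct_with_eebo:.1f}% with EEBO")
print(f"{remaining:,} remaining, {matched:,} matched")
```

The results line up with the figures quoted in the note: roughly 93%, over 98%, and 1,667,401 records with holdings set.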
OCLC also sent what they called “cross-reference” files for each of the original 21 files processed. Each of these files consisted of a listing of “local system number/OCLC number” pairs, one pair per matched record. By comparing these files against the local USF database, we were able to determine whether the OCLC number returned after processing was already present in the local record or was new/changed. The new/changed numbers will be loaded into the local database records to add the new OCLC number or replace a defunct existing one. So in addition to holdings being rectified with OCLC, local records are improved by gaining OCLC numbers or by having numbers corrected that have changed since the original download.
          Records     Existing Single Match   New       Existing Mult Match
Totals:   1,667,401   1,435,933               192,023   39,445
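The comparison of cross-reference pairs against the local database can be sketched roughly as follows. This is a simplified illustration, not the procedure FCLA actually ran: the function names, the in-memory `local_index` structure, and the sample identifiers are all hypothetical, and the sketch only separates "existing" from "new or changed" numbers. It does not attempt the single/multiple match distinction shown in the table, since the report does not define it. The last lines simply confirm that the three reported categories add up to the matched total.

```python
from collections import Counter


def classify_pair(local_numbers, returned_number):
    """Label one cross-reference pair.

    local_numbers: set of OCLC numbers already on the local record.
    returned_number: OCLC number returned in the cross-reference file.
    """
    if returned_number in local_numbers:
        return "existing"
    return "new_or_changed"


def summarize(xref_pairs, local_index):
    """Count labels over (local_system_number, oclc_number) pairs."""
    counts = Counter()
    for local_id, oclc_number in xref_pairs:
        known = local_index.get(local_id, set())
        counts[classify_pair(known, oclc_number)] += 1
    return counts


# Made-up sample data for illustration; not real USF records.
local_index = {
    "000000001": {"12345"},   # record already holds the returned number
    "000000002": set(),       # record has no OCLC number yet
    "000000003": {"777"},     # record holds a different (defunct) number
}
xref_pairs = [
    ("000000001", "12345"),
    ("000000002", "67890"),
    ("000000003", "778"),
]
result = summarize(xref_pairs, local_index)
print(dict(result))

# Consistency check on the reported totals line.
EXISTING_SINGLE, NEW, EXISTING_MULT = 1_435_933, 192_023, 39_445
breakdown_total = EXISTING_SINGLE + NEW + EXISTING_MULT
print(breakdown_total)
```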

Looking at unresolved records:
In addition to the summary spreadsheet and the cross-reference files, OCLC also posted files of MARC records for the 139,409 unresolved records. These files are intended to be downloaded and handled by local staff for ultimate resolution. Although this number sounds quite large at first, preliminary spot checking of the records reveals that many are either honors theses or part of the “Eighteenth Century Collections Online” (ECCO) collection of purchased e-books. These records would not be expected to be in OCLC’s database. The numbers break down as follows:

Honors thesis = 1,750
ECCO = 136,957
Total = 138,707

If we can assume that none of these records exist in OCLC’s database, then about 99% of the unresolved records are already accounted for. It is a bit surprising that just two classes of records account for such a large portion of the unresolved. Much closer analysis is still needed to verify the actual situation for the unresolved records. It is clear, however, that a project to add USF honors theses to OCLC should be undertaken, and that USF, OCLC, and the ECCO vendor should hold talks to get those records into WorldCat. It looks like OCLC and Gale are already negotiating concerning the ECCO collection: http://www.oclc.org/us/en/worldcatlocal/support/vendor.htm. Records that fall into neither the honors theses nor the ECCO group are candidates for manual processing and should therefore be isolated from the rest of the unresolved records. Work toward this goal is ongoing.
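The share of unresolved records explained by the two classes, and the size of the residue left for manual review, follow directly from the counts above. The figures are taken from this report; the variable names are illustrative, and because the class counts came from preliminary spot checking, the residue should be read as an estimate rather than a final work list.

```python
UNRESOLVED = 139_409     # total unresolved records returned by OCLC
HONORS_THESES = 1_750    # unresolved records identified as honors theses
ECCO = 136_957           # unresolved records identified as ECCO e-books

# Records explained by the two classes together.
accounted = HONORS_THESES + ECCO

# Percentage of all unresolved records that those two classes cover.
pct_accounted = 100 * accounted / UNRESOLVED

# What is left over: candidates for manual processing.
manual_candidates = UNRESOLVED - accounted

print(f"{accounted:,} accounted for ({pct_accounted:.1f}%)")
print(f"{manual_candidates:,} candidates for manual processing")
```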


Last modified 12 March 2009 at 3:08 pm by jeanp