Endeca Overview
Endeca Overview in outline form:
- Data processing/merging
- BIB and HOL data extracted from Aleph Oracle (z00) x 11
- Merge routine
- 'Endeca Field Mapping and Pipeline' shows the action that is taken during the merge routine for each MARC field
- Deduping on OCLC number
- HOL data written into the Union MARC
- p_print_03 for formats and UTF8 encoding
- The merge routine happens before the field mapping.
- QF data and other data sources in the future (digital libraries)
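The merge step above (dedupe on OCLC number, then write HOL data into the union record) can be sketched roughly as follows. This is a minimal illustration, not the actual merge routine: the record layout (`oclc`, `bib`, `holdings` keys) and the function name `merge_by_oclc` are hypothetical, and using the first record's BIB as the base is an assumption. The real routine applies per-field actions described in 'Endeca Field Mapping and Pipeline'.

```python
from collections import defaultdict


def merge_by_oclc(records):
    """Group records by OCLC number and fold their holdings into one union record.

    Each input record is a dict with hypothetical keys:
      "oclc"     -> OCLC number as a string, or None if absent
      "bib"      -> the bibliographic data (opaque here)
      "holdings" -> list of holdings entries
    """
    groups = defaultdict(list)
    unmatched = []
    for rec in records:
        if rec.get("oclc"):
            groups[rec["oclc"]].append(rec)
        else:
            # No OCLC number: nothing to dedupe on, so pass through unmerged.
            unmatched.append(rec)

    union = []
    for oclc, recs in groups.items():
        # Assumption: the first record's BIB data is used as the base.
        merged = {"oclc": oclc, "bib": recs[0]["bib"], "holdings": []}
        for rec in recs:
            merged["holdings"].extend(rec.get("holdings", []))
        union.append(merged)

    for rec in unmatched:
        union.append(
            {"oclc": None, "bib": rec["bib"], "holdings": list(rec.get("holdings", []))}
        )
    return union
```

Two records that share an OCLC number come out as one union record carrying both sets of holdings; a record without an OCLC number is kept as its own record.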
- Endeca Forge and Dgidx
- Forge is a data processing program
- Endeca provided a custom MARC adapter to make MARC records readable by Forge
- Transforms your source data into standardized, tagged Endeca records
- Each record has a list of dimension (text) values tagged to it.
- Dgidx is an indexing program
- Reads the tagged Endeca records that were prepared by Forge
- Creates the proprietary indices for the Endeca MDEX Engine
- Dgraph: An Index for every N-value
- Entire Endeca Database stored in memory
- Indexing: program that analyzes and extracts indexable tokens from text in records
- Output stored in directories on the file system (fclnx19).
- Indexing Configuration in Endeca (pipeline) includes:
- Stop words
- Character normalization and ‘internationalization’
- Thesaurus and Stemming (automatic)
- Taxonomies ('hierarchies') e.g. LCC/NLM
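To make the indexing configuration items above more concrete, here is a small sketch of what "extracting indexable tokens" with character normalization and stop words can look like. It is a generic illustration only and does not reproduce Endeca's tokenizer: the stop-word list, the function names, and the choice to strip diacritics via Unicode decomposition are all assumptions.

```python
import unicodedata

# Illustrative stop-word list; the real list lives in the Endeca pipeline config.
STOP_WORDS = {"a", "an", "the", "of", "and"}


def normalize(text):
    """Lowercase the text and strip diacritics (one possible normalization)."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.lower()


def index_tokens(text):
    """Return the tokens that would be indexed: normalized, minus stop words."""
    # Replace anything that is not a letter or digit with a space, then split.
    cleaned = "".join(ch if ch.isalnum() else " " for ch in normalize(text))
    return [tok for tok in cleaned.split() if tok not in STOP_WORDS]
```

With this sketch, a title such as "The Història of Florida" would be indexed under `historia` and `florida`, so a search typed without the accent still matches.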
- Search Configuration ('interfaces')
- Relevance ranking
- DYM and Spell Correction
- Truncation (not implemented)
- MDEX Engine serves data in response to a query; the response includes all of the information needed to build an entire page
- Queries include Search and Navigation
- An entire page is returned in response to a query, constructed from a subset of the Dgraph
- Subsequent navigation is applied to this object, not the entire Dgraph.
- Documents – See https://sblogs.fcla.edu/index.php/endeca
- 'Endeca Field Mapping and Pipeline' shows how MARC fields are mapped to Endeca record fields.
- 'Endeca Dimensions (Facet) Mappings' shows how MARC fields are mapped to Endeca dimensions (facets).
- 'Endeca Search Configuration' shows the search 'interfaces' and the Endeca record fields that are searched.
- WebApp by FCLA
- Apache Tomcat and JSP
- maintains a connection state with Endeca Nav Engine
- similarly, maintains a connection with ORACLE via JDBC
- restarted every morning
- Endeca custom API includes
- Http Connections into the MDEX
- Method to query via a URL
- A result object that can be parsed and manipulated for display
- A method to get the dimensions, dimension values, and corresponding IDs for a particular Navigation state.
- Boolean query mode
- interaction with other features (no stop words, stemming, spelling, thesaurus, ranking)
- proximity searching (NEAR/n, ONEAR/n)
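As a rough illustration of querying via a URL in Boolean mode with a proximity operator, the sketch below builds a query string. Several things here are assumptions rather than facts from this outline: the parameter names `N`, `Ntk`, `Ntt` and `Ntx` and the `mode matchboolean` value follow Endeca's URL conventions as commonly documented, the base URL and the search interface name `Keyword` are hypothetical, and the helper names are made up for the example.

```python
from urllib.parse import urlencode


def proximity_expr(left, right, distance, ordered=False):
    """Build a proximity expression: NEAR/n (any order) or ONEAR/n (ordered)."""
    op = "ONEAR" if ordered else "NEAR"
    return f"{left} {op}/{distance} {right}"


def boolean_query_url(base_url, search_key, expression, nav_state="0"):
    """Assemble a search URL that asks for Boolean match mode.

    Parameter names are assumed from Endeca's URL conventions:
      N   -> navigation state (0 = no refinements)
      Ntk -> search interface (key) to search
      Ntt -> search terms
      Ntx -> search mode
    """
    params = {
        "N": nav_state,
        "Ntk": search_key,
        "Ntt": expression,
        "Ntx": "mode matchboolean",
    }
    return f"{base_url}?{urlencode(params)}"


# Hypothetical host and search interface name, for illustration only.
url = boolean_query_url(
    "http://example.org/search", "Keyword", proximity_expr("florida", "history", 3)
)
```

Because Boolean mode does not interact with stop words, stemming, spelling, the thesaurus or ranking, the expression is matched as written.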
- Statistical Report (See http://www.fcla.edu/FCLAinfo/stats/endeca_stat/endeca_stat.html and linked document that explains the reporting categories)
- Hooks into Aleph
- Availability: Circ (item) status
- Patron Empowerment
- Loans list
- Renewals
- Requests and Holds
- SFX Contextual links for Full Text
- Custom Features
- List Functions
- RSS
- Hooks to RefWorks
- Permalink
- MARC Views via SQL
- Debug View
- Book covers
- Development Environment
- subversion
- svnlog (http://catalog.fcla.edu/svnlog.xml)
- Circulation function of Webapp
- Availability check
- SQL query to Aleph
- Checks if there is a due date in z36
- Checks the z36_status (L or C)
- Checks the z30 if not loaned
- Gets info for item from local copy of tab15 (tabs loaded every Sun. via crontab)
- If the item status is in the 90s OR col. 6=N, displays tab15 col. 5 text.
- If col. 6=Y
- Checks for reserve label (day loan/hour loan) and displays tab15.col6.text
- Checks if there is an IPS and, if so, displays the IPS text
- In all other cases “Available” will display
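The availability check above reads as a decision chain, which can be sketched as below. This is one interpretation of the outline, not the Webapp's code. The dict keys, the function name, the shape of the `tab15` lookup, and the text shown for loaned items are all hypothetical; the outline does not say what is displayed when a due date or an L/C loan status is found.

```python
def availability_text(item, tab15):
    """Return the availability text for one item.

    item  -> dict with hypothetical keys:
             "due_date" (from z36), "loan_status" (z36_status),
             "item_status" (from z30), "on_reserve", "ips_text"
    tab15 -> dict mapping item status code to (col. 5 text, col. 6 flag)
    """
    # Steps 1-2: is there a loan record with a due date, and what is its status?
    if item.get("due_date"):
        if item.get("loan_status") in ("L", "C"):
            return "Not available"  # placeholder wording, not from the outline
        return f"Checked out, due {item['due_date']}"  # placeholder wording

    # Steps 3-4: not loaned, so look at the item record and the local tab15 copy.
    status = item.get("item_status", "")
    text, col6 = tab15.get(status, ("", "Y"))

    # Item status in the 90s, or col. 6 = N: show the tab15 text.
    in_nineties = status.isdigit() and 90 <= int(status) <= 99
    if in_nineties or col6 == "N":
        return text

    # col. 6 = Y: reserve label first, then IPS, then the default.
    if item.get("on_reserve"):
        return text
    if item.get("ips_text"):
        return item["ips_text"]
    return "Available"
```

The order of the checks matters: a loan record wins over everything else, and "Available" is only reached when no earlier rule produced a text.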
- Item Requests
- UF gets the label “Request”; all others get “Place a Hold”
- Calls cgi program that interfaces with Aleph API for circ
- Nothing is displayed if the cgi returns an error
- Troubleshooting
- Try to place a request in the circ client, and note any error messages and codes
- tab_hold_request : CIRC only
- note check_hold_request code in header
- $aleph_error_eng : grep for code.
- Display of item information
- tab40.ene and tab_sub_library.ene are extracted daily via crontab
- Library/Collection Facet
- File ‘sublib_display’ generated from all tab_sub_library and tab40 files
- ‘sublib_display’ used to generate facet values during forge
- Brief Record
- ‘sublib_display’ used to map Library/collection text (can update w/o forge)
- Full Record
- Item information comes from SQL query into Aleph
- ‘sublib_display’ used to map Library/collection text (summary hol and detailed hol, can update w/o forge)
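The display-time use of ‘sublib_display’ described above amounts to a lookup from sublibrary and collection codes to display text. The sketch below shows that idea only. The file layout is not given in this outline, so the pipe-delimited format (`sublibrary|collection|text`), the sample codes, and the function names are all assumptions.

```python
def load_sublib_display(lines):
    """Parse lines of an assumed 'sublibrary|collection|display text' format."""
    mapping = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        sublib, collection, text = line.split("|", 2)
        mapping[(sublib, collection)] = text
    return mapping


def display_text(mapping, sublib, collection):
    """Look up the Library/Collection text, falling back to the raw codes."""
    return mapping.get((sublib, collection), f"{sublib} {collection}".strip())


# Made-up sample content, for illustration only.
sample = [
    "# sublibrary|collection|display text",
    "LIB1|GEN|Main Library, General Collection",
    "LIB1|REF|Main Library, Reference",
]
mapping = load_sublib_display(sample)
```

Because the brief and full record pages do this lookup when the page is built, an edit to the file shows up without rerunning Forge. The facet values are different: they are generated from the file during Forge, so they change only after the next Forge run.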