Prof. Edward A. Fox, Dept. of Computer Science, Virginia Tech, Blacksburg.
This tutorial will provide a thorough and deep introduction to the DL field, introducing and building upon a firm theoretical foundation (starting with "5S": Streams, Structures, Spaces, Scenarios, Societies), giving careful definitions and explanations of all the key parts of a "minimal digital library", and expanding from that basis to cover key DL issues, illustrated with a well-chosen set of case studies. Attendees will receive a partial draft copy of a new book under development, with tentative title "Foundations for Information Systems: Digital Libraries and the 5S Framework".
Goals are to:
Motivation: Why do we need DLs (Goals, objectives)? What are DLs? How do DLs work? Why do we need this book? Why 5S? History: Memex (Bush), Licklider, Web... Related Areas: LIS (Bibliometrics), probability/statistics (distribution, e.g., Zipf), linguistics, AI, databases. Knowledge management, content management, ...; Context. Running examples: Institutional repositories, Archaeological info worldwide. "Other people's" Definitions.
Text: Character strings and coding (Unicode); Morphology -> Stemming; Syntax, semantics -> stop words; Stemming, stopping; Multilingual issues. Images: Processing and Analysis. Audio; Video. Integrating streams: Synchronization, Rendering, ...
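As a rough illustration of the stopping and stemming steps listed above, the following Python sketch tokenizes a string, drops stop words, and strips a few common suffixes. The stop list, the suffix list, and the function names are assumptions made for this example; the suffix stripper is a toy, not the Porter algorithm or any stemmer named in the tutorial.

```python
import re

# Illustrative stop list (an assumption); production lists are much longer.
STOP_WORDS = {"a", "an", "and", "are", "in", "is", "of", "the", "to"}

# Suffixes removed by this toy stemmer, tried in this order.
SUFFIXES = ("ing", "ies", "es", "ed", "s")


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def naive_stem(token):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def index_terms(text):
    """Stopping followed by stemming: the terms an index would store."""
    return [naive_stem(t) for t in tokenize(text) if t not in STOP_WORDS]


print(index_terms("The libraries are indexing digital collections"))
```

Stems such as "librar" are not dictionary words; for indexing, it only matters that related word forms map to the same string.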
Retrieval Models: General issues: natural vs. query languages. Boolean: Extended Boolean. Vector: LSI. Probabilistic: Classical; Belief Network, inference network; Language Models.
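To make the vector model concrete, here is a minimal Python sketch that weights terms by tf-idf and ranks documents by cosine similarity to a query. The weighting formula (raw term frequency times log(N/df) + 1), the function names, and the sample documents are assumptions chosen for illustration; the tutorial outline does not prescribe a particular weighting scheme.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """docs is a list of token lists; returns (sparse vectors, idf table)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    # The "+ 1.0" keeps terms that occur in every document from vanishing
    # (an assumption of this sketch, not the only possible choice).
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}
    vectors = [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query, docs):
    """Return indices of documents with non-zero similarity, best first."""
    vectors, idf = tf_idf_vectors(docs)
    q = {t: tf * idf[t] for t, tf in Counter(query).items() if t in idf}
    scored = [(cosine(q, v), i) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for score, i in scored if score > 0.0]


docs = [
    ["digital", "library"],
    ["image", "retrieval"],
    ["digital", "image", "library", "library"],
]
print(rank(["library"], docs))
```

Unlike a Boolean query, which only partitions the collection into matching and non-matching documents, this model produces an ordering of the matching ones.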
User interfaces and Visualization: Taxonomy of UI components - by layout, location, shape; CitiViz.
Information Needs/Access: Searching/Discovery (Ad-hoc, Filtering), Browsing (HT, InfoViz, Organizational scheme), Feedback, (Thin/thick client), Workflow. Scenario-Based Design. Usability: Environments for Workflow: DLITE; Tasks, claims, goals. Logging (to capture behavior/identifying sessions by transactions)
User Communities: Authors, editors, teachers. Readers, students, researchers. Accessibility, universal access, handicap.
Librarians: Reference, acquisition, operations.
Research Community: Associations, conferences. Publications. Laboratories and projects.
Social issues: Cooperation, collaboration: Acceptance, adoption (personal, organizational). Sharing info (annotation, ratings). Social networks. Digital divide. Cultural heritage and preservation: Museums. Internationalization.
Economic issues: Security: Authorization, Authentication, Watermarks. Legal issues - terms and conditions: Patents, trademarks, Copyright, Intellectual Property Rights, Digital Rights Management. Publishers, Eprints, Self-Archiving, Cataloguing costs, Open Collections. Sustainability. Open source, commercial, hybrid solutions. E-commerce.
Sets, Groups. Terminology. Packages, Granularity: METS.
Collection Development policies: Coverage, breadth, Acquisition, Removal and retiring policies: Traditional vs. DLs.
Large and Distributed Collections: Efficiency/Effectiveness. Scale: Large Objects (granularity, stream splitting, replication, compression); Intelligence/processing granularity: object, cluster, collection, repository. Parallelism and Distribution: Federation vs. Harvesting.
Cataloguing (as a process): Costs, Sharing, AACR2. Manual vs. (Semi-)Automatic. Distributed vs. centralized. OPACs: Worldcat (OCLC), ... Coverage, breadth. Specificity, depth. Management: versioning, works, multiple representations. Storage: Bucket model.
Naming, Identifiers. Types: Institutional, personal, genre-specific, aggregate, ...
Architectures, Interoperability: Federating: Selecting sites, parallel search (fall-back), fusion/merging of results; Z39.50 (CIMI), SRU/SRW, Dienst. Harvesting: Harvest (the system), OAI.
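For the harvesting approach, a service provider periodically pulls metadata from repositories rather than searching them at query time. The Python sketch below builds OAI-PMH `ListRecords` request URLs using the protocol's standard arguments (`verb`, `metadataPrefix`, `from`, `until`, `set`, `resumptionToken`). The function name and the base URL `http://example.org/oai` are hypothetical; no request is sent, and parsing of the XML response is left out.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None,
                     until_date=None, set_spec=None, resumption_token=None):
    """Build an OAI-PMH ListRecords request URL (no network access)."""
    if resumption_token is not None:
        # In OAI-PMH, resumptionToken is an exclusive argument: a follow-up
        # request carries only the verb and the token.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if from_date is not None:
            params["from"] = from_date      # selective harvesting by datestamp
        if until_date is not None:
            params["until"] = until_date
        if set_spec is not None:
            params["set"] = set_spec        # selective harvesting by set
    return base_url + "?" + urlencode(params)


# Hypothetical repository base URL, used only for illustration.
print(list_records_url("http://example.org/oai", from_date="2005-01-01"))
print(list_records_url("http://example.org/oai", resumption_token="abc123"))
```

An incremental harvester would store the date of its last successful run and pass it as `from_date`, so that only new or changed records are transferred.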
Preservation, Archive: Replication (LOCKSS), emulation, migration, hybrid schemes: UVC (Lorie). Institutions: DLF, Library of Congress, National Archives. People: Besser, Gladney. Standards: OAIS.
Scalability, Storage. OpenURLs (ExLibris). Institutional Repositories (in depth).
Taxonomy of Services: Ontology, Composition, reuse. Creational: Crawling. Preservational. Value-Added: Indexing; Logging; Clustering; Classifying. Info Satisfaction Services: Recommending, Social networks, Portals.
Architectures: Internet middleware; P2P, Grid, Service-Oriented, Client-server, Agents, clusters (simulation - Paul's). System descriptions and comparison (Greenstone, Fedora, EPrints, DSpace, Kepler, Phronesis, DLI spin-offs, VITAL, IBM Content Management). VT: ODL and DL-in-a-Box, MARIAN, 5S Suite.