Thursday, September 17, 2009

Week 4 reading notes

Database:
(http://en.wikipedia.org/wiki/Database)

Most of this was new to me, so this will be mostly notes with a couple of thoughts here and there. Notes in quotations are taken from the Wikipedia article above.

A database is “an integrated collection of logically related records or files consolidated into a common pool that provides data for many applications. In one view, databases can be classified according to types of content: bibliographic, full-text, numeric, and images.”

The data in a database is organized according to a database model, the most common one being the relational model.
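
To see what "relational" means in practice, here is a minimal sketch using Python's built-in sqlite3 module. It is my own illustration, not from the article: the tables, column names, and the titles_by helper are all invented. The idea is that data lives in tables, and a shared key (author_id here) relates rows in one table to rows in another.

```python
import sqlite3

# In-memory database with two related tables (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE books (
        id INTEGER PRIMARY KEY,
        title TEXT,
        author_id INTEGER REFERENCES authors(id)
    );
    INSERT INTO authors VALUES (1, 'Jane Austen'), (2, 'Frank Herbert');
    INSERT INTO books VALUES (1, 'Emma', 1), (2, 'Persuasion', 1), (3, 'Dune', 2);
""")

def titles_by(author_name):
    """Return the titles written by an author, found by joining the tables."""
    query = """
        SELECT books.title FROM books
        JOIN authors ON books.author_id = authors.id
        WHERE authors.name = ?
        ORDER BY books.title
    """
    return [title for (title,) in conn.execute(query, (author_name,))]
```

Calling titles_by('Jane Austen') looks up the author by name and follows the author_id relationship back to the books table.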

Architecture:
On-line Transaction Processing systems (OLTP) use “row oriented” datastore architecture, while data-warehouse and other retrieval-focused applications or bibliographic database (library catalogue) systems may use a column-oriented DBMS (database management system) architecture.
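
A rough sketch of the row/column difference, again my own illustration with invented records and function names. Real storage engines are far more involved than Python lists and dicts, but the layout idea is the same.

```python
# Hypothetical catalogue records, invented for illustration.
records = [
    {"id": 1, "title": "Dune", "year": 1965},
    {"id": 2, "title": "Emma", "year": 1815},
    {"id": 3, "title": "Ulysses", "year": 1922},
]

# Row-oriented layout: each record is stored together, so fetching or
# updating one whole record (typical of OLTP) touches a single entry.
row_store = [(r["id"], r["title"], r["year"]) for r in records]

# Column-oriented layout: each field is stored together, so scanning one
# field across all records (typical of retrieval-focused work) reads one list.
column_store = {
    "id": [r["id"] for r in records],
    "title": [r["title"] for r in records],
    "year": [r["year"] for r in records],
}

def get_record(row_id):
    """Row store: return the whole record with the given id, or None."""
    for row in row_store:
        if row[0] == row_id:
            return row
    return None

def earliest_year():
    """Column store: scan only the 'year' column."""
    return min(column_store["year"])
```

Fetching a whole record is natural in the row layout, while a question about a single field across every record only needs one list in the column layout.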

Database management systems:
A DBMS is software that organizes storage of data, controlling “the creation, maintenance, and use of the database storage structures of an organization and its end users.”

A DBMS has five main components:
- Interface drivers: provide methods to prepare and execute statements, get results, etc.
- SQL engine (comprises the three major components below)
- Transaction engine
- Relational engine
- Storage engine

An ODBMS has four main components:
(The article doesn’t say what the O stands for; as far as I can tell it is Object, as in object database management system.)
-Language drivers
-Query engine
-Transaction engine
-Storage engine

Primary tasks of DBMS packages include:
-Database Development: defines and organizes the content, relationships, and structure of the data needed to build a database.
-Database Interrogation: accesses the data in a database for information retrieval. Users can selectively retrieve and display information and produce printed documents.
-Database Maintenance: used to “add, delete, update, correct, and protect the data in a database.”
-Application Development: used to “develop prototypes of data entry screens, queries, forms, reports, tables, and labels for a prototyped application.”

Types of databases:
-Operational
-Analytical
-Data warehouse
-Distributed
-End-user
-External
-Hypermedia
-Navigational
-In-memory
-Document-oriented
-Real-time

Databases can take advantage of indexing to increase speed. “The most common kind of index is a sorted list of the contents of some particular table column, with pointers to the row associated with the value.”
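
The quoted description of an index can be sketched in a few lines of Python. This is my own toy version, assuming a small invented table held in a list; the row number stands in for the "pointer", and the standard bisect module does the lookup in the sorted list.

```python
import bisect

# Hypothetical table: each row is (author, title). Invented data.
rows = [
    ("Woolf", "Orlando"),
    ("Austen", "Emma"),
    ("Joyce", "Ulysses"),
    ("Herbert", "Dune"),
]

# The index: sorted (author, row_number) pairs. The row number plays the
# role of the pointer back to the full row.
author_index = sorted((author, i) for i, (author, _title) in enumerate(rows))
index_keys = [key for key, _ in author_index]

def find_by_author(author):
    """Binary-search the sorted index instead of scanning every row."""
    pos = bisect.bisect_left(index_keys, author)
    matches = []
    # Walk forward over all index entries that share the same key.
    while pos < len(index_keys) and index_keys[pos] == author:
        matches.append(rows[author_index[pos][1]])
        pos += 1
    return matches
```

The table itself stays in its original order; only the index is sorted, which is why a lookup does not need to read every row.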

Database software should enforce the ACID rules:
-Atomicity
-Consistency
-Isolation
-Durability
Many DBMSs relax many of these rules for better performance.
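
Atomicity is the easiest of the four to demonstrate. The sketch below is my own, using Python's built-in sqlite3 module with an invented accounts table and transfer helper: either both updates of a transfer happen, or neither does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move money between accounts as one all-or-nothing transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)
            ).fetchone()
            if balance < 0:
                # Abort: the debit above is undone by the rollback.
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
        return True
    except ValueError:
        return False

def balances(conn):
    """Return the current balances as a {name: balance} dict."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))
```

A transfer that would overdraw the source account returns False and leaves both balances exactly as they were, even though the first update had already run inside the transaction.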

Security is enforced through access control, auditing, and encryption.

“Databases are used in many applications, spanning virtually the entire range of computer software. Databases are the preferred method of storage for large multiuser applications, where coordination between many users is needed.”

My notes: A few terms mentioned in the article were never explained or linked to other articles, for example SQL, ODBMS, and RDBMS (what the O and R stand for). Other than that, it was a decent introduction to the concept of DBMSs and how they work.

~&~

Anne J. Gilliland. Introduction to Metadata: Pathways to Digital Information. 1: Setting the Stage
(http://www.getty.edu/research/conducting_research/standards/intrometadata/setting.html)

Again, all quotes are directly from the article:

Metadata means “data about data”.

“Until the mid-1990s…. metadata referred to a suite of industry or disciplinary standards as well as additional internal and external documentation and other data necessary for the identification, representation, interoperability, technical management, performance, and use of data contained in an information system.”

“In general, all information objects, regardless of the physical or intellectual form they take, have three features…. all of which can and should be reflected through metadata:
-Content relates to what the object contains or is about and is intrinsic to an information object.
-Context indicates the who, what, why, where, and how aspects associated with the object's creation and is extrinsic to an information object.
-Structure relates to the formal set of associations within or among individual information objects and can be intrinsic or extrinsic or both.”

“Library metadata development has been first and foremost about providing intellectual and physical access to collection materials. Library metadata includes indexes, abstracts, and bibliographic records created according to cataloging rules (data content standards).”

“In an environment where a user can gain unmediated access to information objects over a network, metadata
-certifies the authenticity and degree of completeness of the content;
-establishes and documents the context of the content;
-identifies and exploits the structural relationships that exist within and between information objects;
-provides a range of intellectual access points for an increasingly diverse range of users; and
-provides some of the information that an information professional might have provided in a traditional, in-person reference or research setting.”

“Repositories also create metadata relating to the administration, accessioning, preservation, and use of collections…. Integrated information resources such as virtual museums, digital libraries, and archival information systems include digital versions of actual collection content (sometimes referred to as digital surrogates), as well as descriptions of that content (i.e., descriptive metadata, in a variety of formats).”

“Metadata not only identifies and describes an information object; it also documents how that object behaves, its function and use, its relationship to other information objects, and how it should be and has been managed over time.”

Different Types of Metadata…
-Administrative
-Descriptive
-Preservation
-Technical
-Use

Primary Functions of Metadata…
-Creation, multiversioning, reuse, and recontextualization of information objects
-Organization and description
-Validation
-Utilization and preservation
-Disposition

Some Little-Known Facts about Metadata…
-Doesn’t have to be digital
-Is more than the description of an object
-Comes from a variety of sources
-Accumulates during the life of an information object or system
-One information object's metadata can simultaneously be another’s data, depending on aggregations of and dependencies between information objects and systems

Why Is Metadata Important?
-Increased accessibility
-Retention of context
-Expanding use
-Learning metadata
-System development and enhancement
-Multiversioning
-Legal issues
-Preservation and persistence

“Metadata provides us with the Rosetta stone that will make it possible to decode information objects and their transformation into knowledge in the cultural heritage information systems of the future.”

My notes: It took me a while to get through this article. The language was relatively easy to understand, but it stated a lot of facts without many examples, and examples generally help me understand a subject. I did like how she organized a lot of the facts about metadata into tables, which I’ve condensed into short lists here. Presenting the information that way got a lot across without feeling bogged down.

~&~

Eric J. Miller. An Overview of the Dublin Core Data Model
(http://dublincore.org/1999/06/06-overview/)

“The Dublin Core Metadata Initiative (DCMI) is a international effort designed to foster consensus across disciplines for the discovery-oriented description of diverse resources in an electronic environment…. The requirement of providing the means for a modular, extensible, metadata architecture to address local or discipline-specific descriptive needs has been identified since the very beginning of the DCMI work [WF]. The formalized representation of this requirement has been the basis for the Dublin Core Data Model activity.”

DCMI Requirements…
-Internationalization
-Modularization/Extensibility
-Element Identity
-Semantic Refinement
-Identification of encoding schemes
-Specification of controlled vocabularies
-Identification of structured compound values

The Basic Dublin Core Data Model…
-There are resources in the world that we would like to describe. These resources have properties associated with them. The values of these properties can be literals (e.g. string-values) or other resources.
-A resource can be anything that can be uniquely identified.
-Properties are specific types of resources.
-Classes of objects are specific types of resources.
-Literals are terminal resources. (Literals are simple text strings).
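
One way I can picture this model is as a list of (resource, property, value) statements. The sketch below is my own reading of the bullets above, not code from the article: the example.org identifiers and the describe helper are invented, and only the two purl.org property identifiers come from the Dublin Core element set.

```python
# Properties are themselves resources, so they get identifiers too.
DC_TITLE = "http://purl.org/dc/elements/1.1/title"
DC_CREATOR = "http://purl.org/dc/elements/1.1/creator"

# Each statement is (resource, property, value). A value is either a
# literal (plain string) or the identifier of another resource.
# All example.org identifiers are invented for illustration.
statements = [
    ("http://example.org/report", DC_TITLE, "Annual Report"),
    ("http://example.org/report", DC_CREATOR, "http://example.org/staff/jane"),
    ("http://example.org/staff/jane", "http://example.org/terms/name", "Jane Doe"),
]

def is_resource(value):
    """Simplifying assumption: anything that looks like a URI is a resource."""
    return value.startswith("http://")

def describe(resource, depth=0):
    """List a resource's properties, following values that are resources.

    Literals are terminal, so the recursion stops at them. This toy version
    assumes the statements contain no cycles.
    """
    lines = []
    for subject, prop, value in statements:
        if subject != resource:
            continue
        lines.append("  " * depth + f"{prop} -> {value}")
        if is_resource(value):
            lines.extend(describe(value, depth + 1))
    return lines
```

Describing the report yields its title (a literal) and its creator (another resource), and the creator in turn has a name property of its own, which is the "values can be literals or other resources" idea.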

My notes: I’m not really sure what to say about this article. It states that it’s an overview and a work in progress, but it’s dated from 1999, so I’m kind of curious to see what their status is now. With all the advancements in technology over the past ten years, I wonder if their model or any of their requirements have changed since then.

6 comments:

  1. I am not sure what the Dublin Core Data Model means. I think we can understand it better if we use it in an assignment, because we will see how it works and how we can benefit from it.

  2. I agree with your observation on Gilliland's use of tables in arranging the categories of metadata; very helpful. I especially appreciated her inclusion of "Little-Known Facts."
    I am particularly interested in seeing the implications Metadata will have for institutionally based repositories, with emphasis on those that allow for the inclusion of open source publishing.

  3. I am also intrigued by how the Dublin Core is getting on 10 years later. It is probably finished by now. I wonder if new tech has made the project more workable for the non-programmer community.

  4. I just went to the website of the Dublin Core Metadata Initiative and found the "about" article a little clearer than the article we read. What intrigues me is the international effort in this.
    http://dublincore.org/about/

  5. I liked that you broke up a bit of that frightening Dublin Core Data model into bullet points... I should go back and read it that way - it seems to make it easier to digest!
    I was also struck by its age. A lot of technical stuff like that ten years on is often easier for laymen to understand. I'll have to Google it and see if there is any newer information.

  6. Thank you for really tackling the Dublin Core. That was this week's messiest reading for me and I had a hard time really breaking it down. I know it's something I'll be looking back on later and I am going to bookmark your entry.
