- What's a markup language?
  - Example: HTML (Hypertext Markup Language)
  - It's just a text file…
  - …which makes it easy to transfer on the Web.
- It has a variety of functions, such as…
data and digital objects.
- It is a wrapper that goes around digital information: text, images, video.
- XML can encode metadata…
- …but it can also define the features of a document (e.g. table of contents, formatting).
- XML is a way to describe document structure, such as the structure of a book.
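The "wrapper" idea above can be sketched in a few lines of Python using the standard library's `xml.etree.ElementTree`. The element names (`book`, `metadata`, `chapter`, `p`) and the sample values are assumptions made up for illustration; they do not come from any particular standard.

```python
import xml.etree.ElementTree as ET

# Hypothetical element names, chosen only to illustrate the idea.
book = ET.Element("book")

# Metadata: information about the document.
meta = ET.SubElement(book, "metadata")
ET.SubElement(meta, "title").text = "An Example Book"
ET.SubElement(meta, "creator").text = "A. Author"

# Structure: the document itself, wrapped in descriptive tags.
chapter = ET.SubElement(book, "chapter", {"number": "1"})
ET.SubElement(chapter, "p").text = "Hello, World!"

# Serialize to plain text, which is what gets stored or transferred.
xml_text = ET.tostring(book, encoding="unicode")
print(xml_text)

# Any XML-aware program can read the same text back.
parsed = ET.fromstring(xml_text)
title = parsed.findtext("metadata/title")
paragraphs = [p.text for p in parsed.iter("p")]
```

The same text file carries both the metadata and the content, and the tags say which is which.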
brackets, just like HTML.
  - HTML example file:
    <html>
    <head>
    <title>This is My Web Page</title>
    </head>
    <body bgcolor="#FFFFFF">
    <p>Hello, World!</p>
    </body>
    </html>
  - XML is a metalanguage
- HTML describes a web page
  - XML describes all manner of “documents”
- That is, HTML is fixed, limited and informal
  - XML is versatile, multifaceted and formal
come in sets
    <strong>This is some bold text</strong>
    <ol><li>One Item</li> <li>Second Item</li></ol>
- Individual tags must have a terminator: <br />
- Tags must be nested; they cannot overlap: <strong><em>Invalid</strong></em>
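The tag rules above can be checked mechanically. Here is a minimal sketch using Python's standard `xml.etree.ElementTree` parser, which rejects any input that is not well-formed. The helper name `is_well_formed` and the sample fragments are assumptions for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if the parser accepts the fragment, False otherwise."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# Properly nested tags: accepted.
nested = "<p><strong><em>Valid</em></strong></p>"
# Overlapping tags: rejected.
overlapping = "<p><strong><em>Invalid</strong></em></p>"
# A standalone tag without a terminator: rejected.
unterminated = "<p>Line one<br>Line two</p>"
# The same tag with a terminator: accepted.
terminated = "<p>Line one<br />Line two</p>"

for sample in (nested, overlapping, unterminated, terminated):
    print(is_well_formed(sample), sample)
```

An HTML browser would typically tolerate all four fragments; an XML parser accepts only the first and the last.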
Definition
  - A DTD defines how the document is structured, that is, the allowable tags and grammar
  - It sets rules for the document, such as: a <p> is part of a <chapter>, which is part of a <book>, but a <p> is not allowed in a <toc>
- Schemas: a restriction of DTD
  - Can use multiple schemas with a given DTD
- Rigorous grammar = machine readable
  - Platform independent… software independent
  - If you know the DTD, you can write software to read that type of XML file.
  - Correctly formatted (well-formed) XML can be parsed.
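The kind of rule a DTD expresses can be sketched in Python. This is not a real DTD validator (the standard library's `xml.etree.ElementTree` does not validate against DTDs); it is a hand-rolled illustration of the book/chapter/toc rule above. The `ALLOWED_CHILDREN` table, the `entry` tag, and the `violations` function are assumptions invented for this sketch.

```python
import xml.etree.ElementTree as ET

# Hypothetical grammar: which child tags each tag may contain.
ALLOWED_CHILDREN = {
    "book": {"toc", "chapter"},
    "toc": {"entry"},
    "chapter": {"p"},
    "entry": set(),
    "p": set(),
}


def violations(xml_text: str) -> list:
    """List every parent/child pairing that the grammar does not allow."""
    problems = []

    def walk(parent):
        for child in parent:
            if child.tag not in ALLOWED_CHILDREN.get(parent.tag, set()):
                problems.append(
                    f"<{child.tag}> is not allowed inside <{parent.tag}>"
                )
            walk(child)

    walk(ET.fromstring(xml_text))
    return problems


good = "<book><toc><entry>Chapter 1</entry></toc><chapter><p>Text</p></chapter></book>"
bad = "<book><toc><p>Text</p></toc></book>"

print(violations(good))
print(violations(bad))
```

Both documents are well-formed, so both parse; only the grammar check tells them apart.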
the XML specification
  - World Wide Web Consortium (W3C), Cambridge, MA
  - http://www.w3.org/XML/Core/#Publications
- Anyone can use the standard; there are no fees
- Only the W3C can maintain and update it
- The W3C maintains many web standards, such as HTML, XHTML, CSS and PNG
the grammar…
- …which means that XML can contain text, graphics, video… and so on.
- Many new languages are appearing that are based on XML.
  - Such as…
- MetaL: Meta Programming Language
- MML: Music Markup Language
- XBRL: Extensible Business Reporting Language
- MathML: Mathematical Markup Language
- OMF: Weather Observation Definition Format
- Adex: Newspaper Classified Ads Format
- AML: Astronomical Markup Language
- rezML: Resume and Job Listing Markup Language
common platform for electronic delivery of data
- The Swiss Army knife of file formats
- Simpler than SGML
  - XML is actually a simplified subset of SGML (Standard Generalized Markup Language)
  - SGML and XML were both initially intended to facilitate large-scale electronic publishing
to parse… and therefore…
  - …easier to build software
  - SGML systems are complex and expensive
  - XML-based systems are much easier to build
- …easier to transmit on the Internet.
- Greater degree of flexibility… with less complicated grammar.
formatted documents can be mechanically validated for correctness
- Validation ensures proper structure… it does not ensure correct content
- All XML-based languages can be validated
- XHTML can be validated at http://validator.w3.org/
StyleSheets in HTML
- Defines the look of an XML document…
- …that is, how individual tags are presented in, say, a browser or other software
- Multiple stylesheets for multiple uses (e.g. print, on-screen)
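The separation of structure from presentation can be illustrated with a toy example in Python. This is not CSS or XSL syntax; the two "stylesheets" below are plain dictionaries, and the names `SCREEN`, `PRINT` and `render` are assumptions made up for this sketch.

```python
import xml.etree.ElementTree as ET

# Two hypothetical "stylesheets": each maps a tag to a presentation rule.
SCREEN = {
    "title": lambda text: text.upper(),
    "p": lambda text: text,
}
PRINT = {
    "title": lambda text: f"*** {text} ***",
    "p": lambda text: "    " + text,
}


def render(xml_text: str, stylesheet: dict) -> str:
    """Present each top-level element according to the chosen stylesheet."""
    root = ET.fromstring(xml_text)
    lines = []
    for element in root:
        # Tags without a rule are shown unchanged.
        present = stylesheet.get(element.tag, lambda text: text)
        lines.append(present(element.text or ""))
    return "\n".join(lines)


doc = "<article><title>XML Basics</title><p>Tags describe structure.</p></article>"

screen_view = render(doc, SCREEN)
print_view = render(doc, PRINT)
print(screen_view)
print(print_view)
```

The XML document is written once; choosing a different stylesheet changes only how it looks.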
- RDF Site Summary
- A way to provide headlines and content through syndication
- Exciting new format being used…
- …by the press and by individuals (e.g. blogs)
- You can “subscribe” to an RSS news feed.
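Because a feed is just XML, a reader can pull the headlines out with a general-purpose parser. The sketch below assumes a feed laid out in the RSS 2.0 style (`rss` > `channel` > `item`); RDF-based feeds are structured differently and would need different paths. The sample feed, its example.org URLs, and the `headlines` function are invented for illustration.

```python
import xml.etree.ElementTree as ET

# A made-up feed in the RSS 2.0 layout.
FEED = """<rss version="2.0">
  <channel>
    <title>Example News</title>
    <link>http://example.org/</link>
    <description>A sample feed</description>
    <item>
      <title>First headline</title>
      <link>http://example.org/1</link>
      <description>Summary of the first story.</description>
    </item>
    <item>
      <title>Second headline</title>
      <link>http://example.org/2</link>
      <description>Summary of the second story.</description>
    </item>
  </channel>
</rss>"""


def headlines(feed_text: str) -> list:
    """Return (title, link) pairs for every item in the feed."""
    channel = ET.fromstring(feed_text).find("channel")
    return [
        (item.findtext("title"), item.findtext("link"))
        for item in channel.findall("item")
    ]


for title, link in headlines(FEED):
    print(title, "->", link)
```

A feed reader does essentially this on a schedule: fetch the file, parse it, and show any items it has not seen before.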
- SharpReader, Syndirella, Radio Userland
- Common: 3-pane window (like email)
- Also: some use a web-based reader
- The reader automatically updates the feeds on a regular basis.
- Full-text messages vs. summaries
XML-based language
- Profusion of versions and formats
  - 7 different versions
  - And 2 significantly different formats
  - A problem with non-proprietary standards
- RDF: Resource Description Framework
- Encoded Archival Description (EAD): uses SGML, shifting to XML http://www.loc.gov/ead/
- XML: the new standard
- Interoperability: obsolescence is less likely
http://www.loc.gov/standards/mets/
- Dublin Core XML Schemas http://www.dublincore.org/schemas/xmls/
- Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH): a schema for MARC records in XML http://www.openarchives.org/OAI/2.0/guidelines-oai_marc.htm
- RDF: Dublin Core, Open Directory and General Purpose Catalogs http://www.w3.org/RDF/#gen-col
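As a rough illustration of metadata encoded in XML, the Python sketch below reads a small record that uses Dublin Core element names. The namespace URI is the Dublin Core element set namespace; the `record` wrapper element, the sample values, and the `dublin_core_fields` function are assumptions for this example and are not taken from any of the schemas listed above.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core element set.
DC_NS = "http://purl.org/dc/elements/1.1/"

# A made-up record; the <record> wrapper is hypothetical.
RECORD = """<record xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Letters from the Archive</dc:title>
  <dc:creator>Example, Author</dc:creator>
  <dc:date>2004</dc:date>
  <dc:format>text/xml</dc:format>
</record>"""


def dublin_core_fields(record_text: str) -> dict:
    """Collect the Dublin Core elements of a record into a dictionary."""
    root = ET.fromstring(record_text)
    # ElementTree reports namespaced tags as "{namespace-uri}localname".
    prefix = "{" + DC_NS + "}"
    return {
        element.tag[len(prefix):]: element.text
        for element in root
        if element.tag.startswith(prefix)
    }


fields = dublin_core_fields(RECORD)
print(fields)
```

Because the element names come from a shared namespace, software that has never seen this particular record can still tell which field is the title and which is the creator.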
an SGML encoding scheme that is maximally expressive and minimally obsolescent http://www.tei-c.org/
- HPSS: High Performance Storage System http://www.sdsc.edu/hpss/
- ADSM
- The question: is XML an archival format?