Introduction
Anyone who can read English and has a modest knowledge of journal editing and publishing can produce Journal Article Tag Suite (JATS) extensible markup language (XML) files. Manually coding such a document according to a document type definition (DTD) requires 12 hours for a single article, even if any extensible hypertext markup language (XHTML), mathematical markup language (MathML), and chemical markup language (ChemML) portions are treated as figures. Therefore, most printing companies employ a special XML conversion program. In general, such a program generates the preliminary XML file for a single article within 15 minutes; the file is then validated and trimmed. XML coding of articles requires some knowledge of bibliographic formatting conventions in order to distinguish the bibliographic characteristics of the data in an article. For example, the reference section may cite journals, books, web sites, or PhD theses, and these distinct reference types should typically be formatted differently. The objective of this article is to explain how editors can code journal articles according to the JATS 1.0 XML specification and observe the results in a web browser. In particular, at least 20 XML tags are considered, and the roles of DTDs, extensible stylesheet language transformations (XSLT), and cascading style sheets (CSS) are explained. It is expected that, after gaining some experience with the article coding process, editors will have an incentive to adopt JATS XML.
Programs and Sample File
A text editor that supports Unicode and a web browser are necessary for coding a JATS XML file. JATS XML sample files are provided as Supplement 1. The DTD, XSLT, and CSS files are also available: the journalpublishing1.dtd file was available from https://github.com/PeerJ/jats-conversion/blob/master/schema/jats/publishing/1.0/JATS-journalpublishing1.dtd; the jats-html.xsl file was available from https://github.com/ncbi/JATSPreviewStylesheets/blob/master/xslt/main/jats-html.xsl; and the jats-preview.css file was available from https://github.com/wendellpiez/oXygenJATSframework/blob/master/jats-preview-xslt/jats-preview.css.
Coding an XML File and Browsing Using DTD, XSLT, and CSS
First, open the sample coding article within your text editor. Save the file (sample.xml) to a specific directory, ensuring that UTF-8 encoding is maintained. Open the file in a web browser and observe how it appears. It is presented as Fig. 1. Next, copy the DTD file (journalpublishing1.dtd) into the same directory where the sample file is located. Re-open the sample file in a browser and observe how its appearance has changed. To the same directory, add the XSLT file (jats-html.xsl) and then the CSS file (jats-preview.css). Each time re-open sample.xml in a browser and observe any changes (Figs. 2, 3). With each additional file, you should note any improvements in the layout and appearance of the document, and these changes should suggest the function of each file added.
Since a DTD only defines the permitted elements and attributes, the appearance of the sample file in the browser does not change (Fig. 1). An XSLT file, in contrast, defines how an XML document is to be transformed into another format, for example hypertext markup language (HTML), for display (Fig. 2). Finally, a CSS file describes the look and formatting of documents written in markup languages such as HTML (Fig. 3).
How to Declare an XML Document
An XML document begins as follows:
<?xml version="1.0" encoding="UTF-8"?> --- (1)
<?xml-stylesheet type="text/xsl" href="jats-html.xsl"?> --- (2)
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.0 20120330//EN" "http://jats.nlm.nih.gov/publishing/1.0/JATS-journalpublishing1.dtd"> --- (3)
(1) <?xml version="1.0" encoding="UTF-8"?> states that the XML version is 1.0 and that the file is encoded in UTF-8. It is the most common declaration for XML documents.
(2) The <?xml-stylesheet type="text/xsl" href="jats-html.xsl"?> processing instruction specifies the stylesheet; the layout is defined in jats-html.xsl.
(3) The declaration on the third line, beginning with ‘<!DOCTYPE article’, gives the public identifier of the DTD and its online location. If the DTD file is instead located on the local file system, in the same directory as the XML file, the declaration would appear as <!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.0 20120330//EN" "JATS-journalpublishing1.dtd">
The DTD determines the elements and attributes that are permissible within any XML document that refers to it. Previously, different DTDs were used by each publisher; however, the JATS XML DTD is now conventionally used.
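The three opening lines can also be generated programmatically, which makes the difference between the online and the local DTD location explicit. The following Python sketch is illustrative only: the function name build_prolog and its parameters are assumptions introduced here, and only the literal strings come from the declaration shown above.

```python
# Public identifier and online location, copied from declaration (3) above.
PUBLIC_ID = "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.0 20120330//EN"
ONLINE_DTD = "http://jats.nlm.nih.gov/publishing/1.0/JATS-journalpublishing1.dtd"


def build_prolog(xsl_href="jats-html.xsl", dtd_location=ONLINE_DTD):
    """Return the three opening lines of a JATS 1.0 XML file as one string.

    build_prolog is a hypothetical helper; dtd_location may be the online
    URL or the name of a DTD file stored next to the XML file.
    """
    return "\n".join([
        '<?xml version="1.0" encoding="UTF-8"?>',
        f'<?xml-stylesheet type="text/xsl" href="{xsl_href}"?>',
        f'<!DOCTYPE article PUBLIC "{PUBLIC_ID}" "{dtd_location}">',
    ])


if __name__ == "__main__":
    # Prolog for a DTD kept in the same directory as the XML file.
    print(build_prolog(dtd_location="JATS-journalpublishing1.dtd"))
```

Only the last quoted string of the DOCTYPE line changes between the two cases; the public identifier stays the same.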
Article Declaration
<article
article-type="research-article" --- (4)
dtd-version="1.0" xml:lang="en" --- (5)
xmlns:mml="http://www.w3.org/1998/Math/MathML" --- (6)
xmlns:xlink="http://www.w3.org/1999/xlink" --- (7)
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" --- (8)
>
Above is an example of an article declaration.
(4) article-type="research-article" specifies that the publication type is “research article.” A variety of types are available, such as “editorial,” “letter,” and “case report.”
(5) dtd-version="1.0" states the version of the DTD, and xml:lang="en" declares the language in which the article is written. If there is no language declaration, the default value is “en” (English).
(6-8) The last three lines of the article declaration declare the namespace prefixes for W3C MathML (mml), XLink (xlink), and XML Schema-instance (xsi), so that elements and attributes from these vocabularies can be used in the article.
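To see what a parser actually reads from the article declaration, the attributes and namespace declarations can be listed with Python's standard library. This is a minimal sketch: describe_article and SAMPLE are names introduced here for illustration, and the sample is the declaration above with the annotation markers removed and the attribute values quoted.

```python
import io
import xml.etree.ElementTree as ET

# The article declaration from the text, closed immediately so it parses.
SAMPLE = b"""<article
    article-type="research-article"
    dtd-version="1.0" xml:lang="en"
    xmlns:mml="http://www.w3.org/1998/Math/MathML"
    xmlns:xlink="http://www.w3.org/1999/xlink"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
</article>"""


def describe_article(xml_bytes):
    """Return (attributes, namespaces) of the root element (hypothetical helper)."""
    namespaces = {}
    root = None
    events = ("start-ns", "start")
    for event, payload in ET.iterparse(io.BytesIO(xml_bytes), events=events):
        if event == "start-ns":
            prefix, uri = payload      # one pair per xmlns:prefix declaration
            namespaces[prefix] = uri
        elif root is None:
            root = payload             # the first "start" event is the root
    return dict(root.attrib), namespaces


if __name__ == "__main__":
    attributes, namespaces = describe_article(SAMPLE)
    print(attributes)
    print(namespaces)
```

Note that ElementTree reports xml:lang under the key {http://www.w3.org/XML/1998/namespace}lang and keeps the xmlns declarations separate from the ordinary attributes.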
Tag Construction
An article comprises three sections: “front matter,” “body,” and “back matter.” Each section is enclosed within the corresponding tag pair as follows:
<front> --- (9)
...
</front>
<body> --- (10)
...
</body>
<back> --- (11)
...
</back>
(9) “Front matter” consists of citation details, an abstract, keywords, and masthead.
(10) The “body” section includes the article’s primary content from the introduction to conclusion.
(11) “Back matter” comprises the conflict of interest statement, acknowledgments, footnotes, references, appendices, and/or supplements.
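A quick way to confirm that a file follows this three-part layout is to list the child elements of the article. The sketch below assumes a file that contains exactly the three sections described above, in that order; the function names are introduced here for illustration, and the check is not a substitute for validation against the DTD.

```python
import xml.etree.ElementTree as ET


def top_level_sections(xml_text):
    """Return the tag names of the direct children of the root element."""
    root = ET.fromstring(xml_text)
    return [child.tag for child in root]


def has_three_part_layout(xml_text):
    """True if the article consists of front, body, and back, in that order."""
    return top_level_sections(xml_text) == ["front", "body", "back"]


if __name__ == "__main__":
    skeleton = "<article><front/><body/><back/></article>"
    print(top_level_sections(skeleton))
    print(has_three_part_layout(skeleton))
```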
Special Characters
Special characters such as *, <, and > can be specified using the corresponding Unicode numeric character reference; the < character in particular must not appear unescaped in text, because it would otherwise be read as the start of a tag. For example:
* &#x002A;
< &#x003C;
> &#x003E;
A numeric character reference consists of the hexadecimal Unicode code of the character, usually written with four digits, prepended with &#x and appended with a semicolon. For example, the code for the % character is 0025; therefore, the complete character reference is &#x0025;. A full list of codes for all special characters is available at http://www.unicode.org/charts/.
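The rule above (hexadecimal code, prepended with &#x and appended with a semicolon) can be sketched in a few lines of Python. The function name char_ref is an assumption introduced here; the code pads to four hexadecimal digits, as in the text, and the second half checks that an XML parser turns the reference back into the original character.

```python
import xml.etree.ElementTree as ET


def char_ref(ch):
    """Return the hexadecimal character reference for a single character."""
    # ord() gives the Unicode code point; 04X pads it to at least four hex digits.
    return f"&#x{ord(ch):04X};"


def decode_in_xml(reference):
    """Parse a one-element XML document and return its decoded text."""
    return ET.fromstring(f"<p>{reference}</p>").text


if __name__ == "__main__":
    for ch in "*<>%":
        ref = char_ref(ch)
        print(ch, ref, decode_in_xml(ref))
```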
Use of Local Language
To make a full-text JATS XML file for an article in Croatian, it is necessary to add the language attribute xml:lang="hr" to the article declaration. The language is specified as an attribute value, typically a two-letter alphabetic code, in accordance with IETF RFC 5646 (http://tools.ietf.org/html/rfc5646), published by the Internet Engineering Task Force in September 2009. For example, “fr” (French), “en” (English), “de” (German), “sv” (Swedish), “hr” (Croatian), “es” (Spanish), and “ko” (Korean) have been used [1].
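The language attribute can be read back from a file to confirm that it was set as intended. In this minimal Python sketch, article_language is a hypothetical helper; because the standard-library parser does not apply defaults from the DTD, the fallback to “en” is supplied by the function itself, as an assumption that mirrors the default described earlier.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the fixed XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def article_language(xml_text, default="en"):
    """Return the xml:lang value of the root element, or the default."""
    root = ET.fromstring(xml_text)
    return root.get(XML_LANG, default)


if __name__ == "__main__":
    croatian = '<article article-type="research-article" xml:lang="hr"></article>'
    print(article_language(croatian))
    print(article_language("<article></article>"))
```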
Tagging Practice
After adding the journal article’s content to the appropriate sections of the sample file, you can then check how it appears in a web browser.
Validation of JATS XML
Once the JATS XML file has been produced, it can be validated with a variety of tools available online, such as http://www.ncbi.nlm.nih.gov/pmc/tools/xmlchecker/ or http://www.e-sciencecentral.org/tools/stylechecker/. Any errors reported should be fixed in accordance with the JATS DTD.
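Before a file is submitted to a validator, it can be useful to confirm locally that it is at least well-formed XML, since unclosed tags and unquoted attribute values are common errors in manual coding. The Python sketch below does only that; it does not validate against the JATS DTD, and well_formedness_error is a name introduced here for illustration.

```python
import xml.etree.ElementTree as ET


def well_formedness_error(xml_text):
    """Return None if the text is well-formed XML, else the parser's message."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    good = "<article><front></front><body></body><back></back></article>"
    bad = "<article><front></article>"  # <front> is never closed
    print(well_formedness_error(good))
    print(well_formedness_error(bad))
```

A file that passes this check may still be invalid JATS, so the online tools remain necessary.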
Why Is It Necessary to Establish JATS XML-Producing Companies for Each Language?
Only a small portion of society-published journals on the web are so far available as full-text JATS XML. A number of scientific societies in Korea that publish journals have begun to produce full-text JATS XML files and deposit them in ScienceCentral, since at least three Korean firms can generate perfect JATS XML files with table XHTML, ChemML, and MathML [2]. This provides the basis for producing JATS XML files in both English and Korean. There are other excellent global companies that can produce JATS XML files; however, they usually process articles in English. Therefore, JATS XML-producing companies whose specialists are fluent in each language are needed in order to make articles accessible internationally via the web and to deposit them in ScienceCentral for wider exposure.
Conclusion
JATS XML coding is performed according to the JATS DTD, which specifies the permitted elements and attributes. XSLT provides the stylesheet for XML, whereas CSS provides the stylesheet for HTML. With the DTD, XSLT, and CSS files in place, a JATS XML file can be viewed in a user-friendly fashion in a web browser. The language of journals published in local languages can be specified in JATS XML files with the appropriate code in the language attribute. Such journals will then be accessible to readers throughout the world in a variety of formats, including PubReader and ePub.