The XML 1.0 Specification prohibits a DOCTYPE definition with a Public ID and no System ID.
Converters
- 1. Introduction
- 2. Standard Converters
- 2.1. Configuration
- 2.2. XML Converter
- 2.3. HTML Converter
- 2.4. Text Converter
- 3. To-XML Converter
- 4. XSL-FO Converter
- 5. XLS Converters
- 5.1. Preparing the Spreadsheet
- 5.2. To XLS Converter
- 5.3. From XLS Converter
1. Introduction
Converters are processors converting XML documents from one format to another. For example, the standard HTML converter documented below converts an XML document into an HTML document. This HTML document can then be sent to a web browser using the HTTP serializer, or attached to an email with the Email processor.
Converters typically have a data output containing the converted document.
2. Standard Converters
The standard converters convert XML infosets (the XML documents that circulate in Orbeon Forms pipelines) into text according to standard output methods defined by the XSLT specification. They convert to the following formats:
- XML: a standard XML document
- HTML: a standard HTML document
- XHTML: a standard XHTML document
- Text: any text document
The resulting text is sent to the data output. It is embedded in an XML document as specified by the text document format.
2.1. Configuration
The configuration of the standard converters consists of the following optional elements:
Element | Purpose | Default |
---|---|---|
method | XSLT output method (one of xml, html, xhtml or text) | xml, html or text, depending on the converter |
content-type | Content type hint specified on the output document element | Specific to each converter |
encoding | Encoding hint specified on the output document element | utf-8 |
version | HTML or XML version number | 4.01 for HTML (ignored for XML, which always outputs 1.0) |
public-doctype | The public doctype | "-//W3C//DTD HTML 4.01 Transitional//EN" for HTML, none otherwise |
system-doctype | The system doctype | "http://www.w3.org/TR/html4/loose.dtd" for HTML, none otherwise |
omit-xml-declaration | Specifies whether the XML declaration must be omitted | false for XML and HTML (i.e. a declaration is output by default), ignored otherwise |
standalone | If true, specifies standalone="yes" in the document declaration. If false, specifies standalone="no" in the document declaration. If missing, no standalone attribute is produced. For more information about standalone document declarations, please refer to the relevant section of the XML specification. In most cases, this does not need to be specified. | not specified for XML, ignored otherwise |
indent | Specifies whether the output is indented. This means that line breaks may be inserted between adjacent elements. The actual level of indentation is specified with the indent-amount configuration element. | true (ignored for text method) |
indent-amount | Specifies the number of indentation spaces | 1 (ignored for text method) |
Example:
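As a minimal sketch, a configuration for HTML output could combine several of the elements from the table above. The values are illustrative assumptions rather than defaults. The doctype identifiers are the standard W3C ones for HTML 4.01 Strict.

```xml
<config>
    <!-- XSLT output method -->
    <method>html</method>
    <content-type>text/html</content-type>
    <encoding>iso-8859-1</encoding>
    <version>4.01</version>
    <!-- Public and system identifiers are given together -->
    <public-doctype>-//W3C//DTD HTML 4.01//EN</public-doctype>
    <system-doctype>http://www.w3.org/TR/html4/strict.dtd</system-doctype>
    <!-- Indent nested elements by 4 spaces -->
    <indent>true</indent>
    <indent-amount>4</indent-amount>
</config>
```

All elements are optional, so an empty config element leaves every default in place.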
2.2. XML Converter
The XML converter outputs an XML document that conforms to the semantics of the XSLT xml output method. By default, the output is indented with no spaces and encoded using the UTF-8 character set. The default MIME content type is application/xml. The following is a simple XML converter example:
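A minimal sketch of such a pipeline step might look like the fragment below. It assumes that the processor is registered as oxf:xml-converter, that the prefixes p and oxf are bound to the usual Orbeon pipeline and processor namespaces, and that #xml-document and converted are hypothetical ids chosen for this illustration.

```xml
<p:processor name="oxf:xml-converter"
             xmlns:p="http://www.orbeon.com/oxf/pipeline"
             xmlns:oxf="http://www.orbeon.com/oxf/processors">
    <!-- Optional settings; omitted elements keep their defaults -->
    <p:input name="config">
        <config>
            <encoding>utf-8</encoding>
        </config>
    </p:input>
    <!-- Hypothetical id of the XML document produced earlier in the pipeline -->
    <p:input name="data" href="#xml-document"/>
    <!-- The converted text, wrapped in the text document format -->
    <p:output name="data" id="converted"/>
</p:processor>
```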
This is an example of output produced by the XML converter:
2.3. HTML Converter
The HTML converter outputs an HTML document that conforms to the semantics of the XSLT html output method. By default, the doctype is set to HTML 4.01 Transitional and the content is indented with no spaces and encoded using the UTF-8 character set. The default content type is text/html. The following is a simple HTML converter example:
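A minimal sketch, assuming the processor is registered as oxf:html-converter. The ids #html-document and converted are hypothetical, and the doctype values simply override the Transitional default with the standard HTML 4.01 Strict identifiers.

```xml
<p:processor name="oxf:html-converter"
             xmlns:p="http://www.orbeon.com/oxf/pipeline"
             xmlns:oxf="http://www.orbeon.com/oxf/processors">
    <p:input name="config">
        <config>
            <!-- Override the default Transitional doctype -->
            <public-doctype>-//W3C//DTD HTML 4.01//EN</public-doctype>
            <system-doctype>http://www.w3.org/TR/html4/strict.dtd</system-doctype>
            <version>4.01</version>
            <encoding>utf-8</encoding>
        </config>
    </p:input>
    <!-- Hypothetical id of the document to convert -->
    <p:input name="data" href="#html-document"/>
    <p:output name="data" id="converted"/>
</p:processor>
```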
This is an example of output produced by the HTML converter:
2.4. Text Converter
The Text converter outputs a text document that conforms to the semantics of the XSLT text output method. By default, the output is encoded using the UTF-8 character set. This converter is typically useful for pipelines generating Comma Separated Value (CSV) files. The default content type is text/plain. The following is a simple Text converter example:
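A minimal sketch for a CSV-producing pipeline, assuming the processor is registered as oxf:text-converter. The ids #csv-document and converted are hypothetical, and the text/csv content type is an illustrative override of the text/plain default.

```xml
<p:processor name="oxf:text-converter"
             xmlns:p="http://www.orbeon.com/oxf/pipeline"
             xmlns:oxf="http://www.orbeon.com/oxf/processors">
    <p:input name="config">
        <config>
            <!-- Illustrative override of the default text/plain -->
            <content-type>text/csv</content-type>
            <encoding>utf-8</encoding>
        </config>
    </p:input>
    <!-- Hypothetical id of a document whose text content is the CSV data -->
    <p:input name="data" href="#csv-document"/>
    <p:output name="data" id="converted"/>
</p:processor>
```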
This is an example of output produced by the Text converter:
3. To-XML Converter
The To-XML Converter produces parsed XML documents from a binary document format. The data input of the To-XML Converter follows the binary document format. Its data output is an XML document. The mandatory config input contains an empty config element reserved for future configuration parameters. This is an example of use:
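A minimal sketch, assuming the processor is registered as oxf:to-xml-converter. The ids #binary-document and xml-document are hypothetical names for this illustration.

```xml
<p:processor name="oxf:to-xml-converter"
             xmlns:p="http://www.orbeon.com/oxf/pipeline"
             xmlns:oxf="http://www.orbeon.com/oxf/processors">
    <!-- Mandatory, but empty: reserved for future parameters -->
    <p:input name="config">
        <config/>
    </p:input>
    <!-- Hypothetical id of a document in the binary document format -->
    <p:input name="data" href="#binary-document"/>
    <!-- The parsed XML document -->
    <p:output name="data" id="xml-document"/>
</p:processor>
```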
4. XSL-FO Converter
The XSL-FO Converter produces PDF documents from an XSL-FO description of the page. The default content type is application/pdf.
The resulting binary stream is sent to the data output. It is embedded in an XML document as specified by the binary document format.
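A minimal sketch of a pipeline step using this converter follows. The processor name oxf:xslfo-converter and the empty config input are assumptions modeled on the other converters, and the ids #fo-document and pdf are hypothetical.

```xml
<p:processor name="oxf:xslfo-converter"
             xmlns:p="http://www.orbeon.com/oxf/pipeline"
             xmlns:oxf="http://www.orbeon.com/oxf/processors">
    <!-- Assumed: an empty configuration, as for the To-XML Converter -->
    <p:input name="config">
        <config/>
    </p:input>
    <!-- Hypothetical id of a document with an fo:root root element -->
    <p:input name="data" href="#fo-document"/>
    <!-- The PDF, embedded in the binary document format -->
    <p:output name="data" id="pdf"/>
</p:processor>
```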
5. XLS Converters
Orbeon Forms ships with the POI library, which allows importing and exporting Microsoft Excel documents. Orbeon Forms uses an Excel file template to define the layout of the spreadsheet. Cells that will contain values are marked with a special markup.
5.1. Preparing the Spreadsheet
First, create an Excel spreadsheet with the formatting of your choosing. Apply a special markup to the cell you need to export values to:
- Select the cell
- Go to the menu Format->Cell
- In the Number tab, choose the Custom format and enter a format that looks like: #,##0;"/a/b|/c/d". In this example we have 2 XPath expressions separated by a pipe character (|): /a/b and /c/d. The first XPath expression is used when creating the Excel file (exporting) and is run against the data input document of the To XLS converter. The second expression is optional and is used when recreating an XML document from the Excel file (importing with the From XLS converter).
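To make the export side concrete, here is a hypothetical data input document for a cell formatted as #,##0;"/a/b|/c/d". The element names follow the example expressions, and the value is invented for illustration.

```xml
<!-- Hypothetical data input of the To XLS converter -->
<a>
    <!-- Selected by the first expression, /a/b, when exporting -->
    <b>1234</b>
</a>
```

With this input, the export expression /a/b selects the value 1234 for the marked cell.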
5.2. To XLS Converter
The To XLS converter takes a config input describing the XLS template file, and a data input containing the values to be inserted in the template. The processor scans the template and applies XPath expressions to fill it in. It returns a binary document on its data output.
The config input takes a single config element with one attribute:
Attribute | Purpose |
---|---|
template | A URL pointing to an XLS template file |
The config element can also contain zero or more repeat-row elements with two attributes, row-num and for-each.
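A sketch of such a configuration is shown below. The template URL and the XPath expression are hypothetical. The text above does not spell out the semantics of repeat-row; presumably the template row identified by row-num is repeated for each node selected by for-each, and the sketch relies on that assumption.

```xml
<!-- Hypothetical template location -->
<config template="oxf:/export/template.xls">
    <!-- Assumed semantics: repeat the given template row
         once per node selected by the for-each expression -->
    <repeat-row row-num="2" for-each="/orders/order"/>
</config>
```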
The To XLS converter is typically connected to the HTTP serializer. This allows specifying headers such as Content-Disposition:
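A minimal sketch of this arrangement follows. It assumes the processors are registered as oxf:to-xls-converter and oxf:http-serializer, and that the HTTP serializer accepts header elements with name and value children. The template URL, the file name and the id #report-data are hypothetical.

```xml
<p:config xmlns:p="http://www.orbeon.com/oxf/pipeline"
          xmlns:oxf="http://www.orbeon.com/oxf/processors">
    <!-- Fragment: #report-data must be produced by an earlier processor -->
    <p:processor name="oxf:to-xls-converter">
        <p:input name="config">
            <config template="oxf:/export/template.xls"/>
        </p:input>
        <p:input name="data" href="#report-data"/>
        <p:output name="data" id="xls-document"/>
    </p:processor>
    <!-- Send the binary document to the browser as a download -->
    <p:processor name="oxf:http-serializer">
        <p:input name="config">
            <config>
                <header>
                    <name>Content-Disposition</name>
                    <value>attachment; filename=export.xls</value>
                </header>
            </config>
        </p:input>
        <p:input name="data" href="#xls-document"/>
    </p:processor>
</p:config>
```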
5.3. From XLS Converter
The From XLS converter takes an Excel file (for example uploaded with an XForms upload control), finds special markup cells and reconstructs an XML document from this markup. The converter has one data input, which must receive a binary document, and a data output containing the generated XML document. Assume the following XForms model:
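One possible model is sketched here. The element and attribute names (form, file, filename, mediatype) and the instance id are hypothetical. The file element is typed as xs:anyURI so that the upload is stored as a URI.

```xml
<xf:model xmlns:xf="http://www.w3.org/2002/xforms"
          xmlns:xs="http://www.w3.org/2001/XMLSchema"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <xf:instance id="upload-instance">
        <form xmlns="">
            <!-- Receives the uploaded file and its metadata -->
            <file filename="" mediatype="" xsi:type="xs:anyURI"/>
        </form>
    </xf:instance>
</xf:model>
```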
The model can be filled with the following XForms controls:
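A sketch using the standard XForms upload control, assuming the instance contains a file element with filename and mediatype attributes (hypothetical names):

```xml
<xf:upload ref="file" xmlns:xf="http://www.w3.org/2002/xforms">
    <xf:label>Excel file</xf:label>
    <!-- Store the original file name and media type in attributes -->
    <xf:filename ref="@filename"/>
    <xf:mediatype ref="@mediatype"/>
</xf:upload>
```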
Then the following pipeline can extract the data from the uploaded file:
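A minimal sketch of the conversion step, assuming the processor is registered as oxf:from-xls-converter. The id #uploaded-file is hypothetical and stands for the uploaded file made available as a binary document. How that document is obtained from the XForms instance depends on the application and is not shown.

```xml
<p:processor name="oxf:from-xls-converter"
             xmlns:p="http://www.orbeon.com/oxf/pipeline"
             xmlns:oxf="http://www.orbeon.com/oxf/processors">
    <!-- Hypothetical id: the uploaded Excel file as a binary document -->
    <p:input name="data" href="#uploaded-file"/>
    <!-- XML document rebuilt from the marked-up cells -->
    <p:output name="data" id="xls-data"/>
</p:processor>
```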
This is an example of a returned document, given an appropriate configuration of the Excel template: