strings:Text2Xml

Converts a flat sequence of text characters into an XML document in which punctuation marks and whitespace are represented by special elements and attributes. The result is a starting point for further parsing of the text.

Syntax

strings:Text2Xml( text, 'boundary-aliases' )

 

boundary-aliases  ::=  page-alias line-alias element-alias

The required boundary-aliases argument is a string of three character sequences, each separated from the next by a space (two spaces in total). For details, see the Example and the "Boundary aliases" section below.

 

Example

<root xmlns:pc="Processing.Command" pc:hideme="true">
  <pc:defs>
    URE..TEXT. Application
      MYAPP
  </pc:defs>
  <pc:evaluate select="strings:Text2Xml(//pc:defs, 'P l elem')"/>
</root>

This example returns the following XML document. For comments, see the sections below the XML document.

<P>
  <l>
    <elem ws="">URE</elem>
    <pm ws="">.</pm>
    <pm ws="">.</pm>
    <elem ws="">TEXT</elem>
    <pm ws="s">.</pm>
    <elem ws="ntt">Application</elem>
    <eol/>
  </l>
  <l>
    <elem ws="nt">MYAPP</elem>
    <eol/>
  </l>
  <l>
    <eof/>
  </l>
</P>
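
As a further illustration, the call below passes a different alias string. This is a sketch only: the alias names page, line and word are hypothetical, and it assumes that the three aliases are used as the element names for the page, line and element levels, just as 'P l elem' produces <P>, <l> and <elem> in the example above.

```xml
<root xmlns:pc="Processing.Command" pc:hideme="true">
  <pc:defs>
    URE..TEXT. Application
      MYAPP
  </pc:defs>
  <!-- Hypothetical aliases: page-alias = page, line-alias = line, element-alias = word -->
  <pc:evaluate select="strings:Text2Xml(//pc:defs, 'page line word')"/>
</root>
```

Under that assumption, the returned document would have the same structure as above, with <page>, <line> and <word> in place of <P>, <l> and <elem>. The <pm>, <eol> and <eof> elements are not named in the alias string, so they would presumably be unchanged.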

 

 

Boundary aliases

Punctuation marks

Whitespace

Special elements