XMLWF(1)                                                 XMLWF(1)



NAME
       xmlwf - Determines if an XML document is well-formed

SYNOPSIS
       xmlwf [ -s]  [ -n]  [ -p]  [ -x]  [ -e encoding]  [ -w]  [
       -d output-dir]  [ -c]  [ -m]  [ -r]  [ -t]  [ -v]  [  file
       ...]


DESCRIPTION
       xmlwf  uses the Expat library to determine if an XML docu-
       ment is well-formed.  It is non-validating.

       If you do not specify any files on the  command-line,  and
       you have a recent version of xmlwf, the input file will be
       read from standard input.

WELL-FORMED DOCUMENTS
       A well-formed document must adhere to the following rules:

       o The  file begins with an XML declaration.  For instance,
         <?xml version="1.0" standalone="yes"?>.   NOTE:   xmlwf
         does not currently check for a valid XML declaration.

       o Every start tag is either empty (<tag/>) or has a corre-
         sponding end tag.

       o There is exactly one root element.   This  element  must
         contain  all  other elements in the document.  Only com-
         ments, white space, and processing instructions may come
         after the close of the root element.

       o All elements nest properly.

       o All attribute values are enclosed in quotes (either sin-
         gle or double).

       If the document has a DTD, and it strictly  complies  with
       that  DTD,  then  the  document  is also considered valid.
       xmlwf is a non-validating parser -- it does not check  the
       DTD.   However, it does support external entities (see the
       -x option).
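
       The rules above can be tried out with a  minimal  sketch
       in  Python,  whose  standard  xml.parsers.expat module is a
       binding to the Expat library.  This is not how xmlwf itself
       is  implemented;  the  check() helper and the sample docu-
       ments are illustrative assumptions.   None  of  the  samples
       carries an XML declaration.

```python
import xml.parsers.expat as expat
from typing import Optional


def check(data: bytes) -> Optional[str]:
    """Return None if data is well-formed, else a one-line description."""
    parser = expat.ParserCreate()
    try:
        parser.Parse(data, True)  # True marks this as the final chunk
    except expat.ExpatError as err:
        return f"{err.lineno}:{err.offset}: {expat.ErrorString(err.code)}"
    return None


samples = {
    "well-formed": b"<doc><item id='1'/></doc>",
    "missing end tag": b"<doc><item></doc>",
    "two root elements": b"<a/><b/>",
    "unquoted attribute": b"<doc id=1/>",
}
for label, data in samples.items():
    print(f"{label}: {check(data) or 'ok'}")
```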

OPTIONS
       When an option includes an argument, you may  specify  the
       argument  either  separately ("-d output") or concatenated
       with the option ("-doutput").  xmlwf supports both.

       -c     If the input file is well-formed and xmlwf  doesn't
              encounter  any  errors,  the  input  file is simply
              copied to the  output  directory  unchanged.   This
              implies  no  namespaces (turns off -n) and requires
              -d to specify an output directory.

       -d output-dir
              Specifies a directory to contain transformed repre-
              sentations of the input files.  By default, -d out-
              puts a canonical representation (described  below).
              You  can  select  different output formats using -c
              and -m.

              The output filenames will be exactly  the  same  as
              the input filenames or "STDIN" if the input is com-
              ing from standard input.  Therefore,  you  must  be
              careful  that  the output file does not go into the
              same directory as the input file.  Otherwise, xmlwf
              will  delete the input file before it generates the
              output file (just like running cat < file > file in
              most shells).

              Two  structurally  equivalent  XML documents have a
              byte-for-byte identical canonical  XML  representa-
              tion.   Note  that ignorable white space is consid-
              ered significant and  is  treated  equivalently  to
              data.   More  on  canonical  XML  can  be  found at
              http://www.jclark.com/xml/canonxml.html .
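
              A script that drives xmlwf -d can  guard  against
              the  overwrite  hazard  described above.  The sketch
              below assumes the output file is named after the
              last  path  component  of  the  input  file;   the
              safe_output_dir() helper is hypothetical and is  not
              part of xmlwf.

```python
import os
import tempfile


def safe_output_dir(input_path: str, output_dir: str) -> bool:
    """True if writing output_dir/<input name> cannot clobber the input."""
    # Assumption: the output keeps the input's base name.
    target = os.path.join(output_dir, os.path.basename(input_path))
    return os.path.realpath(target) != os.path.realpath(input_path)


with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as out:
    doc = os.path.join(src, "doc.xml")
    with open(doc, "w") as fh:
        fh.write("<doc/>")
    print(safe_output_dir(doc, out))  # separate output directory
    print(safe_output_dir(doc, src))  # same directory as the input
```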

       -e encoding
              Specifies the character encoding for the  document,
              overriding   any   document  encoding  declaration.
              xmlwf supports four built-in  encodings:  US-ASCII,
              UTF-8,  UTF-16,  and  ISO-8859-1.   Also see the -w
              option.
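
              The effect of overriding the encoding can be seen
              in  a  small  Python  sketch  using the standard
              xml.parsers.expat binding to Expat.  The parse_text()
              helper  and the sample bytes are illustrative assump-
              tions; xmlwf itself is not involved.

```python
import xml.parsers.expat as expat


def parse_text(data: bytes, encoding=None) -> str:
    """Parse data and return its character data, optionally forcing an encoding."""
    chunks = []
    parser = expat.ParserCreate(encoding)
    parser.CharacterDataHandler = chunks.append
    parser.Parse(data, True)
    return "".join(chunks)


# ISO-8859-1 bytes with no encoding declaration in the document.
latin1 = "<r>caf\u00e9</r>".encode("iso-8859-1")

try:
    parse_text(latin1)  # without an override, Expat reads the bytes as UTF-8
except expat.ExpatError as err:
    print("default:", expat.ErrorString(err.code))

print("forced:", parse_text(latin1, "ISO-8859-1"))
```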

       -m     Outputs  an  XML  file  that  describes  the  input
              file in full,  including  character  positions.
              Requires  -d  to  specify  an  output directory.

       -n     Turns  on  namespace  processing.  -c disables
              namespaces.
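
              What namespace processing changes can be shown with
              Python's  xml.parsers.expat  binding to Expat.  This
              is a sketch, not xmlwf's own code; the element_names()
              helper,  the  separator  character,  and the sample
              document are assumptions chosen for illustration.

```python
import xml.parsers.expat as expat


def element_names(data: bytes, namespaces: bool) -> list:
    """Collect element names as reported with or without namespace processing."""
    names = []
    # A separator string switches Expat into namespace-aware mode.
    parser = expat.ParserCreate(namespace_separator=" " if namespaces else None)
    parser.StartElementHandler = lambda name, attrs: names.append(name)
    parser.Parse(data, True)
    return names


doc = b'<a:root xmlns:a="urn:example"><a:child/></a:root>'
print(element_names(doc, namespaces=False))  # prefixed names as written
print(element_names(doc, namespaces=True))   # namespace URI + local name
```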

       -p     Tells xmlwf to process external DTDs and  parameter
              entities.

              Normally xmlwf never parses parameter entities.  -p
              tells it to always parse them.  -p implies -x.

       -r     Normally xmlwf  memory-maps  the  XML  file  before
              parsing;  this can result in faster parsing on many
              platforms.  -r turns off  memory-mapping  and  uses
              normal  file  IO calls instead.  Of course, memory-
              mapping is automatically turned  off  when  reading
              from standard input.

              Use  of  memory-mapping can cause some platforms to
              report substantially higher memory usage for xmlwf,
              but  this  appears  to be a matter of the operating
              system reporting memory in a strange way; there  is
              not a leak in xmlwf.

       -s     Prints  an error if the document is not standalone.
              A document is standalone if it has no external sub-
              set and no references to parameter entities.
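
              A comparable check can be sketched  with  Python's
              xml.parsers.expat  binding,  which exposes Expat's
              "not standalone" callback.  The is_standalone() helper
              and  the  sample documents are assumptions; this is
              not a transcription of xmlwf's implementation.

```python
import xml.parsers.expat as expat


def is_standalone(data: bytes) -> bool:
    """False if Expat reports the document as not standalone."""
    flagged = []

    def not_standalone():
        flagged.append(True)
        return 0  # returning 0 makes Expat stop with an error

    parser = expat.ParserCreate()
    parser.NotStandaloneHandler = not_standalone
    try:
        parser.Parse(data, True)
    except expat.ExpatError:
        if flagged:
            return False
        raise  # an unrelated well-formedness error
    return True


print(is_standalone(b"<doc/>"))
print(is_standalone(b'<!DOCTYPE doc SYSTEM "doc.dtd"><doc/>'))
```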

       -t     Turns  on  timings.   This tells Expat to parse the
              entire file, but not perform any processing.   This
              gives  a  fairly  accurate idea of the raw speed of
              Expat itself without client overhead.  -t turns off
              most of the output options (-d, -m, -c, ...).
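
              The idea of timing a parse with no client  process-
              ing  can  be  approximated  in  Python  with  the
              xml.parsers.expat binding.  This sketch includes the
              overhead  of  the  Python  wrapper, so it is only a
              rough stand-in for what -t measures; time_parse() and
              the generated document are assumptions.

```python
import time
import xml.parsers.expat as expat


def time_parse(data: bytes) -> float:
    """Seconds spent parsing data with no handlers installed."""
    parser = expat.ParserCreate()
    start = time.perf_counter()
    parser.Parse(data, True)
    return time.perf_counter() - start


doc = b"<doc>" + b"<item n='1'>text</item>" * 10_000 + b"</doc>"
print(f"{len(doc)} bytes parsed in {time_parse(doc):.4f} s")
```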

       -v     Prints the version of the Expat library being used,
              including some information on the compile-time con-
              figuration of the library, and then exits.

       -w     Enables  support for Windows code pages.  Normally,
              xmlwf will throw an error  if  it  runs  across  an
              encoding  that it is not equipped to handle itself.
              With -w, xmlwf will try to use a Windows code page.
              See also -e.

       -x     Turns on parsing external entities.

              Non-validating  parsers are not required to resolve
              external entities, or even expand entities at  all.
              Expat  always  expands  internal  entities (?), but
              external entity parsing must be enabled explicitly.

              External  entities  are simply entities that obtain
              their data from  outside  the  XML  file  currently
              being parsed.

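              This is an example of an internal entity:

              <!ENTITY vers '1.0.2'>

              And here are some examples of external entities:

              <!ENTITY header SYSTEM "header-&vers;.xml">   (parsed)
              <!ENTITY logo SYSTEM "logo.png" NDATA image>  (unparsed)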
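
              The difference between internal and external enti-
              ties can be observed with Python's xml.parsers.expat
              binding, where external entity handling is likewise
              opt-in.   This  is  a sketch under assumptions: the
              entity names, the header.xml system identifier, and
              the  parse()  helper  are  made up, and the handler
              only records the request instead of fetching  any-
              thing.

```python
import xml.parsers.expat as expat

DOC = b"""<!DOCTYPE doc [
  <!ENTITY vers '1.0.2'>
  <!ENTITY header SYSTEM "header.xml">
]>
<doc>&vers;|&header;</doc>"""


def parse(data: bytes, resolve_external: bool):
    """Return (character data, list of external system ids requested)."""
    text, requested = [], []

    def external_ref(context, base, system_id, public_id):
        # A real resolver would fetch and parse system_id here.
        requested.append(system_id)
        return 1  # nonzero tells Expat to carry on

    parser = expat.ParserCreate()
    parser.CharacterDataHandler = text.append
    if resolve_external:
        parser.ExternalEntityRefHandler = external_ref
    parser.Parse(data, True)
    return "".join(text), requested


print(parse(DOC, resolve_external=False))
print(parse(DOC, resolve_external=True))
```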

       --     (Two  hyphens.)   Terminates  the  list of options.
              This is only needed if a  filename  starts  with  a
              hyphen.  For example:

              xmlwf -- -myfile.xml

              will run xmlwf on the file -myfile.xml.

       Older  versions of xmlwf do not support reading from stan-
       dard input.

OUTPUT
       If an input file is not well-formed, xmlwf prints a single
       line describing the problem to standard output.  If a file
       is well-formed, xmlwf  outputs  nothing.   Note  that  the
       result code is not set.

BUGS
       According  to the W3C standard, an XML file without a dec-
       laration at the beginning is not  considered  well-formed.
       However, xmlwf allows this to pass.

       xmlwf  returns a 0 - noerr result, even if the file is not
       well-formed.  There is no good way for a  program  to  use
       xmlwf  to  quickly  check  a file -- it must parse xmlwf's
       standard output.
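
       Given that limitation, a calling program can treat  any
       text on xmlwf's standard output as a failure.  The Python
       sketch below assumes the behavior documented in this page
       (silence  for  well-formed  input,  a diagnostic line on
       standard output otherwise); the helper names are hypo-
       thetical, and check_file() needs xmlwf on the PATH.

```python
import subprocess


def report_is_clean(stdout: str) -> bool:
    """xmlwf prints nothing for well-formed input, so any text means trouble."""
    return stdout.strip() == ""


def check_file(path: str, xmlwf: str = "xmlwf") -> bool:
    """Run xmlwf on path and interpret its standard output."""
    # "--" ends option parsing, so a path starting with "-" is safe.
    result = subprocess.run(
        [xmlwf, "--", path],
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    return report_is_clean(result.stdout)


# Example use (requires xmlwf to be installed):
#     if not check_file("doc.xml"):
#         print("doc.xml is not well-formed")
```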

       The errors should go to standard error, not standard  out-
       put.

       There  should  be  a  way  to get -d to send its output to
       standard output rather than forcing the user to send it to
       a file.

       I  have  no  idea why anyone would want to use the -d, -c,
       and -m options.  If someone could explain it  to  me,  I'd
       like to add this information to this manpage.

ALTERNATIVES
       Here are some XML validators on the web:

       http://www.hcrc.ed.ac.uk/~richard/xml-check.html
       http://www.stg.brown.edu/service/xmlvalid/
       http://www.scripting.com/frontier5/xml/code/xmlValidator.html
       http://www.xml.com/pub/a/tools/ruwf/check.html

SEE ALSO
       The Expat home page:        http://www.libexpat.org/
       The W3 XML specification:   http://www.w3.org/TR/REC-xml

AUTHOR
       This  manual  page  was  written by Scott Bronson for the
       Debian GNU/Linux system (but may  be  used  by  others).
       Permission  is  granted to copy, distribute and/or modify
       this document under the terms of the GNU  Free  Documen-
       tation License, Version 1.1.



                         24 January 2003                 XMLWF(1)
