Getting the Content of XML Tags

$ cat sample.xml

<?xml version="1.0"?> 
<catalog> 
   <book id="bk101"> 
      <author>Gambardella, Matthew</author> 
      <title>XML Developer's Guide</title> 
      <genre>Computer</genre> 
      <price>44.95</price> 
      <publish_date>2000-10-01</publish_date> 
      <description>An in-depth look at creating applications with XML.</description> 
   </book> 
   <book id="bk102"> 
      <author>Ralls, Kim</author> 
      <title>Midnight Rain</title> 
      <genre>Fantasy</genre> 
      <price>5.95</price> 
      <publish_date>2000-12-16</publish_date> 
      <description>A former architect battles corporate zombies,  
      an evil sorceress, and her own childhood to become queen  
      of the world.</description> 
   </book> 
   <book id="bk103"> 
      <author>Corets, Eva</author> 
      <title>Maeve Ascendant</title> 
      <genre>Fantasy</genre> 
      <price>5.95</price> 
      <publish_date>2000-11-17</publish_date> 
      <description>After the collapse of a nanotechnology  
      society in England, the young survivors lay the  
      foundation for a new society.</description> 
   </book> 
   <book id="bk104"> 
      <author>Corets, Eva</author> 
      <title>Oberon's Legacy</title> 
      <genre>Fantasy</genre> 
      <price>5.95</price> 
      <publish_date>2001-03-10</publish_date> 
      <description>In post-apocalypse England, the mysterious  
      agent known only as Oberon helps to create a new life  
      for the inhabitants of London. Sequel to Maeve  
      Ascendant.</description> 
   </book> 
   <book id="bk105"> 
      <author>Corets, Eva</author> 
      <title>The Sundered Grail</title> 
      <genre>Fantasy</genre> 
      <price>5.95</price> 
      <publish_date>2001-09-10</publish_date> 
      <description>The two daughters of Maeve, half-sisters,  
      battle one another for control of England. Sequel to  
      Oberon's Legacy.</description> 
   </book> 
</catalog> 

You want to extract the text between the “<description>” and “</description>” tags.

The first occurrence is on a single line. The rest span multiple lines, and you want the newlines preserved. I shall assume that you want the leading whitespace on the continuation lines preserved as well.

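Here’s the script. Undefining $/ in the BEGIN block makes Perl read the whole file as one record, the /s modifier lets “.” match newlines, and the non-greedy “(.*?)” stops at the nearest closing tag: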

$
$ perl -lne 'BEGIN{undef $/} while (/<description>(.*?)<\/description>/sg){print $1}' sample.xml
An in-depth look at creating applications with XML.
A former architect battles corporate zombies,
      an evil sorceress, and her own childhood to become queen
      of the world.
After the collapse of a nanotechnology
      society in England, the young survivors lay the
      foundation for a new society.
In post-apocalypse England, the mysterious
      agent known only as Oberon helps to create a new life
      for the inhabitants of London. Sequel to Maeve
      Ascendant.
The two daughters of Maeve, half-sisters,
      battle one another for control of England. Sequel to
      Oberon's Legacy.
$
$
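
The one-liner above treats the file as plain text. As a hedged alternative, here is a minimal Python sketch that uses a real XML parser (the standard-library xml.etree.ElementTree) instead of a regex. This is a swapped-in technique, not the method used in this post. SAMPLE is a shortened stand-in for sample.xml, and the function name descriptions is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for sample.xml (two of the five books); an assumption
# made so the sketch runs without the file on disk.
SAMPLE = """<?xml version="1.0"?>
<catalog>
   <book id="bk101">
      <description>An in-depth look at creating applications with XML.</description>
   </book>
   <book id="bk102">
      <description>A former architect battles corporate zombies,
      an evil sorceress, and her own childhood to become queen
      of the world.</description>
   </book>
</catalog>
"""


def descriptions(xml_text):
    """Return the text of every <description> element, whitespace intact."""
    root = ET.fromstring(xml_text)
    # iter() walks the whole tree, so the nesting depth does not matter.
    return [elem.text or "" for elem in root.iter("description")]


if __name__ == "__main__":
    for text in descriptions(SAMPLE):
        print(text)
```

Unlike the regex, a parser also decodes entities such as &amp; and still matches when the tag carries attributes. To run it against the real file, ET.parse("sample.xml").getroot() could replace the ET.fromstring call.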

If you want the newlines preserved but the whitespace at the beginning of each continuation line removed:

$
$ perl -lne 'BEGIN{undef $/} while (/<description>(.*?)<\/description>/sg){($x = $1) =~ s/\n\s*/\n/g; print $x}' sample.xml
An in-depth look at creating applications with XML.
A former architect battles corporate zombies,
an evil sorceress, and her own childhood to become queen
of the world.
After the collapse of a nanotechnology
society in England, the young survivors lay the
foundation for a new society.
In post-apocalypse England, the mysterious
agent known only as Oberon helps to create a new life
for the inhabitants of London. Sequel to Maeve
Ascendant.
The two daughters of Maeve, half-sisters,
battle one another for control of England. Sequel to
Oberon's Legacy.
$
$
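
The substitution s/\n\s*/\n/g can be sketched in Python as follows. One detail to be aware of: \s also matches newlines, so blank lines inside a description are swallowed too. The second function is a variant that only strips spaces and tabs. Both function names are hypothetical, chosen for this illustration.

```python
import re


def strip_indent(text):
    # Same pattern as the Perl substitution: a newline plus any
    # whitespace after it (including further newlines) becomes one newline.
    return re.sub(r"\n\s*", "\n", text)


def strip_indent_keep_blank_lines(text):
    # Variant: only spaces and tabs after a newline are removed,
    # so empty lines inside the text survive.
    return re.sub(r"\n[ \t]*", "\n", text)


if __name__ == "__main__":
    chunk = "A former architect,\n      an evil sorceress,\n\n      of the world."
    print(strip_indent(chunk))
    print("---")
    print(strip_indent_keep_blank_lines(chunk))
```

For the sample file the two behave the same, since none of the descriptions contains an empty line.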

And if you want neither the newlines nor the leading whitespace, i.e. each chunk between “<description>” tags on a single line:

$
$ perl -lne 'BEGIN{undef $/} while (/<description>(.*?)<\/description>/sg){($x = $1) =~ s/\s*\n\s*/ /g; print $x}' sample.xml
An in-depth look at creating applications with XML.
A former architect battles corporate zombies, an evil sorceress, and her own childhood to become queen of the world.
After the collapse of a nanotechnology society in England, the young survivors lay the foundation for a new society.
In post-apocalypse England, the mysterious agent known only as Oberon helps to create a new life for the inhabitants of London. Sequel to Maeve Ascendant.
The two daughters of Maeve, half-sisters, battle one another for control of England. Sequel to Oberon's Legacy.
$
$
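
Flattening a chunk onto one line can also be sketched in Python by splitting on whitespace and rejoining with single spaces. This is a slightly different technique from a newline substitution: it also collapses runs of spaces inside a line, and the result does not depend on whether the source lines end with trailing spaces. The function name one_line is hypothetical.

```python
def one_line(text):
    # str.split() with no argument splits on any run of whitespace
    # (spaces, tabs, newlines) and drops leading/trailing whitespace.
    return " ".join(text.split())


if __name__ == "__main__":
    chunk = (
        "A former architect battles corporate zombies,  \n"
        "      an evil sorceress, and her own childhood to become queen  \n"
        "      of the world."
    )
    print(one_line(chunk))
```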