
Using sed with HTML data

I am having trouble using sed in combination with HTML. The following example illustrates the problem:

HTML="<html><body>ENTRY</body></html>"
TABLE="<table></table>"
echo $HTML | sed -e s/ENTRY/$TABLE/

      

Output:

sed: -e expression #1, char 18: unknown option to `s'

      

If I leave the / out of $TABLE, so that it becomes <table><table>, it works fine.

Any ideas on how to fix it?

Update
Here's a sample that can reproduce the problem:

template.html:

<html>
    <body>
        <table>
            ENTRIES
        </table>
    </body>
</html>

      

gui_template:

<tr>
  <td class="td_tut_title">TITLE</td>
  <td class="td_tut_content">
    <a href="../tutorials/GUI/FILENAME"><img src="img/bbp.png" alt="bbp" /></a>
  </td>
</tr>

      

genhtml.sh:

#!/bin/bash
HTML=`cat template.html`
ENTRIES=`cat gui_template | sed -e s/FILENAME/test/ | sed -e s/TITLE/title/`
DELIM=$'\377'
echo $HTML | sed -e "s${DELIM}ENTRIES${DELIM}$ENTRIES${DELIM}"

      

Output:

~/htmlgen $ ./genhtml.sh 
sed: -e expression #1, char 14: unterminated `s' command

      



3 answers


Use a different separator, for example @:

echo "$HTML" | sed -e "s@ENTRY@$TABLE@"
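
A minimal sketch of why the separator matters, using the values from the question. With / as the separator, the / inside </table> ends the replacement early and sed reads the following t as a flag, hence the "unknown option" error. With @, which does not occur in $TABLE, the same substitution goes through. The quoting and the RESULT variable are additions made here for illustration.

```shell
#!/bin/bash
HTML="<html><body>ENTRY</body></html>"
TABLE="<table></table>"

# Fails: the "/" in "</table>" terminates the replacement early,
# so sed exits with an error and prints nothing to stdout.
if ! echo "$HTML" | sed -e "s/ENTRY/$TABLE/" 2>/dev/null; then
    echo "slash separator failed" >&2
fi

# Works: "@" does not occur in $TABLE.
RESULT=$(echo "$HTML" | sed -e "s@ENTRY@$TABLE@")
echo "$RESULT"
```

This only helps as long as the chosen separator never shows up in the replacement text.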

      



Running these lines in a FreeBSD console:

HTML="<html><body>ENTRY</body></html>"
TABLE="<table></table>"
echo $HTML | sed -e "s#ENTRY#$TABLE#"

      



Result:

<html><body><table></table></body></html>

      



You need to use a delimiter that cannot appear in $TABLE, and if $TABLE is unpredictable that can be tricky. I would suggest using a non-printable character as the delimiter; it is much easier to find one that will not appear in $TABLE and break everything. The only problem is that such characters are harder to type, so I would suggest putting the delimiter in a variable and using that in the sed command:

DELIM=$'\377'
HTML="<html><body>ENTRY</body></html>"
TABLE="<table></table>"
echo "$HTML" | sed -e "s${DELIM}ENTRY${DELIM}$TABLE${DELIM}"

      

Note that the $'...' construct is a bash feature; if you need this to run under a generic sh, you need to do something dirtier, e.g. DELIM="$(printf "\377")". Also, I chose \377 (FF in hex) because it is illegal in UTF-8, so it should be safe if you are using UTF-8 for your HTML; if you are using something else, such as Windows-1252, then \177 (the "DEL" character) might be a safer choice.
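
The sh-compatible variant could look roughly like the sketch below. It builds the delimiter with printf instead of $'...'. The LC_ALL=C prefix is an assumption added here, not part of the answer: it asks sed to treat its script and input as plain bytes, since how a sed implementation handles a lone \377 byte in a UTF-8 locale may vary.

```shell
#!/bin/sh
# Build the delimiter with printf so the script does not rely on bash's $'...'.
DELIM=$(printf '\377')
HTML="<html><body>ENTRY</body></html>"
TABLE="<table></table>"

# LC_ALL=C is an added precaution (an assumption, see above): byte-wise processing.
RESULT=$(printf '%s\n' "$HTML" | LC_ALL=C sed -e "s${DELIM}ENTRY${DELIM}${TABLE}${DELIM}")
printf '%s\n' "$RESULT"
```

The delimiter only protects against the separator clash; an & or a backslash in $TABLE is still interpreted by sed.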

Oh yes, and if you try to debug this with bash -x, be prepared for a comedy.
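
A different technique, not the one described above, is to escape the replacement text instead of hunting for a safe delimiter. The sketch below assumes / stays the separator and uses a hypothetical helper, escape_replacement, that puts a backslash before every \, / and &, and ends each line except the last with a backslash so that a multi-line replacement does not trigger the "unterminated `s' command" error. The sample template and row, including the R&D cell, are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper (not from the answers): escape a string so it can be
# used as the replacement part of s/.../.../ with "/" as the separator.
escape_replacement() {
    # 1st expression: put a backslash before every \, / and &.
    # 2nd expression: end every line but the last with a backslash, which is
    # how sed expects a newline inside a replacement to be written.
    printf '%s\n' "$1" | sed -e 's|[\\/&]|\\&|g' -e '$!s|$|\\|'
}

# Made-up sample data, modelled on the question's template files.
HTML='<table>
    ENTRIES
</table>'
ENTRIES='<tr>
  <td>R&D</td>
  <td><a href="../tutorials/GUI/test">link</a></td>
</tr>'

RESULT=$(printf '%s\n' "$HTML" | sed -e "s/ENTRIES/$(escape_replacement "$ENTRIES")/")
printf '%s\n' "$RESULT"
```

Quoting "$HTML" keeps the template's line breaks, which an unquoted echo $HTML would collapse into spaces.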
