XML with a special character and UTF-8 encoding

I have some simple questions because I got confused reading all the different answers.

1) Suppose I have an XML file with this prolog: <?xml version="1.0" encoding="utf-8" ?>

and I am going to unmarshal it from Java (e.g. with JAXB). I suppose I cannot insert CROSS OF LORRAINE ( http://www.fileformat.info/info/unicode/char/2628/index.htm ) directly, but I can put "\u2628" instead, right?

2) I also heard that UTF-8 does not contain it, but everything in Unicode can be saved with UTF-8 (or UTF-16) encoding, and here is an example from this page:

UTF-8 (hex) 0xE2 0x98 0xA8 (e298a8)

Am I thinking correctly? Can I use this form and put it in xml using utf-8 encoding?



3 answers


This should be absolutely fine - UTF-8 can encode any Unicode character.
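As a quick check of that claim, here is a minimal sketch that encodes U+2628 with the JDK's built-in UTF-8 charset and prints the bytes as hex. The class and method names (`Utf8BytesDemo`, `utf8Hex`) are made up for this illustration.

```java
import java.nio.charset.StandardCharsets;

class Utf8BytesDemo {
    /** Returns the UTF-8 bytes of the given text as lowercase hex. */
    static String utf8Hex(String text) {
        StringBuilder hex = new StringBuilder();
        for (byte b : text.getBytes(StandardCharsets.UTF_8)) {
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        // "\u2628" is the Java source escape for CROSS OF LORRAINE.
        System.out.println(utf8Hex("\u2628")); // prints e298a8
    }
}
```

The result matches the three bytes quoted in the question (0xE2 0x98 0xA8).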

XML has some restrictions around control characters (U+0000 to U+001F), but U+2628 should be fine.
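To make the restriction concrete, the sketch below checks a code point against the `Char` production of XML 1.0 as I understand it: tab, line feed and carriage return are the only characters below U+0020 that are allowed. The helper name `isXml10Char` is hypothetical, not part of any library.

```java
class XmlCharCheck {
    /** True if the code point may appear in an XML 1.0 document (assumed Char production). */
    static boolean isXml10Char(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD      // tab, LF, CR
            || (cp >= 0x20 && cp <= 0xD7FF)             // most of the BMP, including U+2628
            || (cp >= 0xE000 && cp <= 0xFFFD)           // skips the surrogate block
            || (cp >= 0x10000 && cp <= 0x10FFFF);       // supplementary planes
    }

    public static void main(String[] args) {
        System.out.println(isXml10Char(0x2628)); // true
        System.out.println(isXml10Char(0x0001)); // false
    }
}
```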



(Personally, I prefer to go to unicode.org for the definitive code charts, but U+2628 definitely shows up here.)

You don't have to worry about UTF-8 things - you should be able to put a character in your data directly and let JAXB do the encoding.
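The same idea can be sketched without JAXB, which is no longer bundled with recent JDKs. This example swaps in the JDK's StAX writer instead: the character is passed as ordinary string data and the writer handles the encoding. The element name `note` and the class name are assumptions made for the illustration.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

class DirectCharacterDemo {
    /** Writes <note>text</note> as a UTF-8 encoded XML document. */
    static byte[] writeNote(String text) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            XMLStreamWriter w =
                XMLOutputFactory.newInstance().createXMLStreamWriter(out, "UTF-8");
            w.writeStartDocument("UTF-8", "1.0");
            w.writeStartElement("note");
            w.writeCharacters(text); // no manual escaping or byte handling
            w.writeEndElement();
            w.writeEndDocument();
            w.close();
            return out.toByteArray();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] xml = writeNote("\u2628");
        System.out.println(new String(xml, StandardCharsets.UTF_8));
    }
}
```

With JAXB the equivalent would be to put the character in a String field and marshal the object; the point is the same either way, the application code never touches the UTF-8 bytes.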



If your prolog declares UTF-8 encoding for the XML:

<?xml version="1.0" encoding="utf-8" ?>

then you can use Unicode characters directly, or you can write them as numeric character references such as &#9768;
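A small sketch to show that the two spellings are equivalent to a parser: it parses the literal character, the decimal reference &#9768; and the hexadecimal reference &#x2628; with the JDK's DOM parser and compares the resulting text. The element name `a` and the helper `rootText` are invented for this example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;

class CharRefDemo {
    /** Parses UTF-8 encoded XML and returns the text of the root element. */
    static String rootText(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String prolog = "<?xml version=\"1.0\" encoding=\"utf-8\" ?>";
        String direct = rootText(prolog + "<a>\u2628</a>");
        String decimal = rootText(prolog + "<a>&#9768;</a>");
        String hex = rootText(prolog + "<a>&#x2628;</a>");
        System.out.println(direct.equals(decimal) && direct.equals(hex)); // true
    }
}
```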



One more addition:

Just specifying the encoding in the prolog is not enough. You need to make sure the content is actually serialized using that encoding.
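The following sketch illustrates what goes wrong when the prolog and the actual bytes disagree. It serializes the same document once as UTF-8 and once as ISO-8859-1, then decodes both as UTF-8, which is what a reader trusting the prolog would do. The class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class SerializationEncodingDemo {
    static final String XML =
        "<?xml version=\"1.0\" encoding=\"utf-8\" ?><a>\u2628</a>";

    /** Serializes XML with the given charset, then decodes it as UTF-8 as the prolog promises. */
    static String roundTrip(Charset actual) {
        byte[] bytes = XML.getBytes(actual);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Bytes really are UTF-8: the character survives.
        System.out.println(roundTrip(StandardCharsets.UTF_8).contains("\u2628"));      // true
        // Bytes are ISO-8859-1, which cannot represent U+2628: the character is lost.
        System.out.println(roundTrip(StandardCharsets.ISO_8859_1).contains("\u2628")); // false
    }
}
```

In practice this usually means passing an explicit charset to whatever writer or output stream produces the file, instead of relying on the platform default.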
