Discussion:
[Classpathx-javamail] UnsupportedEncodingException for unicode-1-1-utf-7
Boris Folgmann
2009-11-27 12:38:22 UTC
Hi,

I get this exception when parsing an email:

java.io.UnsupportedEncodingException: unicode-1-1-utf-7
    at sun.io.Converters.getConverterClass(Converters.java:218)
    at sun.io.Converters.newConverter(Converters.java:251)
    at sun.io.ByteToCharConverter.getConverter(ByteToCharConverter.java:68)
    at sun.nio.cs.StreamDecoder$ConverterSD.<init>(StreamDecoder.java:224)
    at sun.nio.cs.StreamDecoder$ConverterSD.<init>(StreamDecoder.java:210)
    at sun.nio.cs.StreamDecoder.forInputStreamReader(StreamDecoder.java:77)
    at java.io.InputStreamReader.<init>(InputStreamReader.java:83)
    at gnu.mail.handler.Text.getContent(Text.java:106)
    at javax.activation.DataSourceDataContentHandler.getContent(DataHandler.java:803)
    at javax.activation.DataHandler.getContent(DataHandler.java:550)
    at javax.mail.internet.MimeBodyPart.getContent(MimeBodyPart.java:691)
    at gnu.mail.providers.imap.IMAPBodyPart.getContent(IMAPBodyPart.java:282)

The encoding might be unsupported by Java SE 5.0, but it is a common
encoding for emails, see RFC 1642 (UTF-7 - A Mail-Safe Transformation
Format of Unicode), so I don't see why JavaMail shouldn't support it.

Is it possible to implement a dirty workaround, like forcing the BodyPart's
encoding to US-ASCII before calling getContent()? I couldn't find a
suitable method.

cu,
boris
Chris Burdess
2009-11-27 14:28:22 UTC
UTF7 is supported, however I have never heard of anything being identified as "unicode-1-1-utf-7". Do you have a copy of the original message to hand?
--
Chris Burdess
Boris Folgmann
2009-11-27 16:16:00 UTC
Hi Chris,
Chris Burdess wrote:
> UTF7 is supported, however I have never heard of anything being
> identified as "unicode-1-1-utf-7". Do you have a copy of the original
> message to hand?

Quoting from http://www.faqs.org/rfcs/rfc1642.html:

-------------------------------------------------------------------------
Use of Character Set UTF-7 Within MIME

Character set UTF-7 is safe for mail transmission and therefore may
be used with any content transfer encoding in MIME (except where line
length and line break restrictions are violated). Specifically, the 7
bit encoding for bodies and the Q encoding for headers are both
acceptable. The MIME character set identifier is UNICODE-1-1-UTF-7.

Example. Here is a text portion of a MIME message containing the
Unicode sequence "Hi Mom <WHITE SMILING FACE>!" (hexadecimal 0048,
0069, 0020, 004D, 006F, 004D, 0020, 263A, 0021).

Content-Type: text/plain; charset=UNICODE-1-1-UTF-7

Hi Mom +Jjo-!
[...]
-------------------------------------------------------------------------
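To make the RFC example above concrete, here is a minimal sketch of a UTF-7 decoder in plain Java. The class name Utf7Sketch and its decode method are hypothetical; they are not part of GNU JavaMail or Java SE. It assumes the input has already been transfer-decoded into 7-bit ASCII text (for instance by reading the bytes from Part.getInputStream() instead of calling getContent()), and it does not validate malformed input.

```java
// Hypothetical helper, not part of GNU JavaMail or Java SE.
final class Utf7Sketch {
    // Modified base64 alphabet from RFC 1642 (standard base64 without '=' padding).
    private static final String B64 =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    // Decodes UTF-7 text that is already available as an ASCII string.
    static String decode(String in) {
        StringBuilder out = new StringBuilder(in.length());
        int i = 0;
        while (i < in.length()) {
            char c = in.charAt(i++);
            if (c != '+') {
                out.append(c);              // directly encoded character
                continue;
            }
            if (i < in.length() && in.charAt(i) == '-') {
                out.append('+');            // "+-" stands for a literal '+'
                i++;
                continue;
            }
            int bits = 0;                   // bit buffer for the base64 run
            int bitCount = 0;               // number of valid bits in the buffer
            while (i < in.length()) {
                int v = B64.indexOf(in.charAt(i));
                if (v < 0) {
                    break;                  // a non-base64 character ends the run
                }
                i++;
                bits = (bits << 6) | v;
                bitCount += 6;
                if (bitCount >= 16) {       // a full UTF-16 code unit is available
                    bitCount -= 16;
                    out.append((char) ((bits >> bitCount) & 0xFFFF));
                    bits &= (1 << bitCount) - 1;
                }
            }
            if (i < in.length() && in.charAt(i) == '-') {
                i++;                        // a terminating '-' is absorbed
            }
            // Leftover bits (fewer than 16) are padding and are dropped.
        }
        return out.toString();
    }
}
```

With this sketch, Utf7Sketch.decode("Hi Mom +Jjo-!") yields "Hi Mom ", then U+263A (WHITE SMILING FACE), then "!", which matches the example in the RFC.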

cu,
boris
Chris Burdess
2009-11-27 16:26:16 UTC
Ah, my bad. UTF7 is only supported at the IMAP protocol level, not at the MIME encoding level. If you're interested in developing this functionality, the class gnu.inet.imap.UTF7imap may be helpful.
--
Chris Burdess