What: SMS / Text Messaging Character Encoding Sets and Parameters
Links:
http://discussion.forum.nokia.com/forum/showthread.php?t=54717 (most of the information below is from this pithy post).
http://www.mobilecity.cz/doc/GSM_03.38_5.3.0.pdf (technical specification for the default SMS character set and encoding).
http://www.dreamfabric.com/sms/default_alphabet.html (clear chart of the 7-bit default SMS alphabet)
http://www.mblox.com/mblox/technology/technical_features.asp - MBLOX’s supported character set information
When: Apr. 16, 2007
Overview:
While developing SMS campaigns for partner organizations, I was often asked whether it was possible to send non-English characters, such as Spanish or Chinese characters. When I checked with MBLOX, they told me it was possible by using a different encoding, but that the encoding takes up much more space, leaving less room for the actual message. That gave me a quick answer for our partners, but I wanted to get to the bottom of it, so I did a bit of research. Here is what I found:
- The character limit of an SMS message is actually a data limit.
- The protocol allows up to 140 bytes in each message. With the 7-bit default character set, that works out to the familiar 160 characters (140 × 8 / 7 = 160).
- The default SMS character set is a 7-bit set defined in a technical specification by the European Telecommunications Standards Institute (GSM 03.38). It is not the same as ASCII. See the links above for the default character set.
- It seems that there are Spanish characters in this character set, but I have not yet figured out how to properly encode them in my messages. If you know, please let me know. Alternatively, it may be that MBLOX does not implement the entire default set(?). I’ve tried inserting various character codes, but none seem to work.
- If you want to use characters, symbols, or glyphs that are not in this default character set, you need to encode your message in Unicode UCS-2, which uses 2 bytes per character, so each message can hold only 70 characters (140 / 2 = 70). The entire message must use the same encoding.
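The arithmetic above can be sketched in a few lines of Python. This is only an illustration, not how MBLOX or any gateway actually does it: the names `GSM_BASIC`, `GSM_EXTENDED`, and `sms_capacity` are made up for this example, and the alphabet strings are transcribed from the GSM 03.38 chart linked above. One thing the chart makes clear for Spanish is that ñ, Ñ, ¿, ¡ and é are in the default set, while á, í, ó and ú are not, so a single accented vowel is enough to push the whole message into UCS-2.

```python
# GSM 03.38 default alphabet, in code order 0x00..0x7F (0x1B is the escape code).
GSM_BASIC = (
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞ\x1bÆæßÉ"
    " !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
# Extension-table characters: each is sent as escape + code, so it costs 2 septets.
GSM_EXTENDED = "^{}\\[~]|€\f"

SMS_PAYLOAD_BYTES = 140  # data limit of a single message


def sms_capacity(text):
    """Return (encoding, units_used, units_available) for one SMS.

    Units are 7-bit septets for the default alphabet and
    16-bit code units for UCS-2.
    """
    septets = 0
    for ch in text:
        if ch in GSM_EXTENDED:
            septets += 2
        elif ch in GSM_BASIC and ch != "\x1b":
            septets += 1
        else:
            # One character outside the default set forces the whole
            # message into UCS-2. Assumption: count 16-bit units via
            # UTF-16, which matches UCS-2 for ordinary characters.
            units = len(text.encode("utf-16-be")) // 2
            return ("UCS-2", units, SMS_PAYLOAD_BYTES // 2)
    return ("GSM 7-bit", septets, SMS_PAYLOAD_BYTES * 8 // 7)


if __name__ == "__main__":
    for sample in ("mañana", "¿Cómo estás?", "€5"):
        print(sample, sms_capacity(sample))
```

Running a draft message through a check like this before sending shows whether it fits in the 160-character budget or drops to the 70-character one.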
All of the above applies only to GSM-based phones, as the data limits are particular to GSM’s SMS system. Longer messages are possible on Sprint’s CDMA-based messaging system, for example. But since you may be sending to a device on a GSM network, the GSM limits serve as the lowest common denominator.
Ref: 06