Text formatting with EMS

This post was written by Jeroen on March 12, 2009
Posted Under: SMS
This entry is part 12 of 17 in the series Sending out an SMS

We have seen how you can use the User Data Header (UDH) in an SMS message to combine several SMS messages into one bigger one. Here is another application of the UDH:

Text formatting in SMS

Remember that the UDH consists of Information Elements (IEs) that each have the following structure:

Length    Value      Description
1 octet   IEI        Information Element Identifier. This determines what this IE is about.
1 octet   IE length  The length of the data belonging to this IE in octets.
n octets  IE data    Meaning of the content varies per IEI.
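
The IE structure above lends itself to a simple loop. Here is a minimal Python sketch, assuming the UDH is available as a bytes object without its leading length octet. The function name parse_udh is made up for this illustration:

```python
def parse_udh(udh: bytes) -> list:
    """Split a UDH (without the leading UDHL octet) into (IEI, data) pairs."""
    elements = []
    pos = 0
    while pos < len(udh):
        if pos + 2 > len(udh):
            raise ValueError("truncated IE header")
        iei = udh[pos]            # 1 octet: Information Element Identifier
        length = udh[pos + 1]     # 1 octet: length of the IE data
        data = udh[pos + 2 : pos + 2 + length]
        if len(data) != length:
            raise ValueError("truncated IE data")
        elements.append((iei, data))
        pos += 2 + length         # jump to the next IE
    return elements


# Two IEs with IEI 0x0A and 3 data octets each.
for iei, data in parse_udh(bytes.fromhex("0A031906200A03210410")):
    print(hex(iei), data.hex())
```

Because every IE carries its own length, a parser like this can step over IEs it does not understand.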


The text formatting is controlled by just one IE. Here is a description:

Length   Value             Description
1 octet  IEI = 0x0A        IEI 0x0A is used to describe text formatting.
1 octet  IE length = 0x04  The length of the data is 0x03 or 0x04, depending on whether the color octet (below) is used.
1 octet  start position    The formatting described in the formatting octet (below) applies starting at the character position indicated here (counted from 0).
1 octet  length            The formatting applies to the number of characters indicated by this octet. The value 0 means that the formatting is the new default formatting.
1 octet  formatting        This octet indicates how the text is to be aligned, the size of the text, and whether the text is bold, italic, underlined or strikethrough.
1 octet  color             This octet is optional; its presence is indicated by the IE length octet. It controls both background and foreground colors.
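
To make the layout concrete, here is a small Python sketch that assembles such an IE. The function name text_format_ie is hypothetical, and the values in the usage lines are taken from the example message later in this post:

```python
from typing import Optional

TEXT_FORMATTING_IEI = 0x0A


def text_format_ie(start: int, length: int, formatting: int,
                   color: Optional[int] = None) -> bytes:
    """Build a text formatting IE; the color octet is added only when given."""
    data = [start, length, formatting]
    if color is not None:
        data.append(color)
    # bytes() raises ValueError if any value does not fit in one octet.
    return bytes([TEXT_FORMATTING_IEI, len(data)] + data)


print(text_format_ie(0x19, 6, 0x20).hex().upper())        # IE length 3, no color
print(text_format_ie(0x38, 7, 0x00, 0x2B).hex().upper())  # IE length 4, with color
```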

Now the only thing left is to describe the formatting and color octets in more detail.

The formatting octet’s 8 bits have the following meaning:

Bits                      Value     Meaning
1 and 0 (rightmost bits)  0 = 0x00  Left aligned
                          1 = 0x01  Center aligned
                          2 = 0x02  Right aligned
                          3 = 0x03  Default
3 and 2                   0 = 0x00  Normal size
                          1 = 0x04  Large text
                          2 = 0x08  Small text
                          3 = 0x0C  Unused
4                         0 = 0x00  Bold off
                          1 = 0x10  Bold on
5                         0 = 0x00  Italic off
                          1 = 0x20  Italic on
6                         0 = 0x00  Underline off
                          1 = 0x40  Underline on
7                         0 = 0x00  Strikethrough off
                          1 = 0x80  Strikethrough on

The values can be combined using a bitwise OR. For instance, a formatting octet with value 0x39 would mean center aligned, small text, bold and italic.
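
The bit layout can be checked with a few lines of Python. This is only a sketch: the constant and function names are invented for this example, and the meanings come straight from the table above.

```python
ALIGNMENT = {0: "left", 1: "center", 2: "right", 3: "default"}
SIZE = {0: "normal", 1: "large", 2: "small", 3: "unused"}
BOLD, ITALIC, UNDERLINE, STRIKETHROUGH = 0x10, 0x20, 0x40, 0x80


def describe_formatting(octet: int) -> dict:
    """Decode a formatting octet into its separate fields."""
    return {
        "align": ALIGNMENT[octet & 0x03],      # bits 1 and 0
        "size": SIZE[(octet >> 2) & 0x03],     # bits 3 and 2
        "bold": bool(octet & BOLD),            # bit 4
        "italic": bool(octet & ITALIC),        # bit 5
        "underline": bool(octet & UNDERLINE),  # bit 6
        "strikethrough": bool(octet & STRIKETHROUGH),  # bit 7
    }


# Center aligned (0x01), small text (0x08), bold and italic.
octet = 0x01 | 0x08 | BOLD | ITALIC
print(hex(octet))
print(describe_formatting(octet))
```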

The color octet consists of two 4-bit values:

Bits                  Value      Meaning
3-0 (rightmost bits)  0 = 0x00   Text black
                      1 = 0x01   Text dark grey
                      2 = 0x02   Text dark red
                      3 = 0x03   Text dark yellow
                      4 = 0x04   Text dark green
                      5 = 0x05   Text dark cyan
                      6 = 0x06   Text dark blue
                      7 = 0x07   Text dark magenta
                      8 = 0x08   Text grey
                      9 = 0x09   Text white
                      10 = 0x0A  Text bright red
                      11 = 0x0B  Text bright yellow
                      12 = 0x0C  Text bright green
                      13 = 0x0D  Text bright cyan
                      14 = 0x0E  Text bright blue
                      15 = 0x0F  Text bright magenta
7-4 (leftmost bits)   0 = 0x00   Background black
                      1 = 0x10   Background dark grey
                      2 = 0x20   Background dark red
                      3 = 0x30   Background dark yellow
                      4 = 0x40   Background dark green
                      5 = 0x50   Background dark cyan
                      6 = 0x60   Background dark blue
                      7 = 0x70   Background dark magenta
                      8 = 0x80   Background grey
                      9 = 0x90   Background white
                      10 = 0xA0  Background bright red
                      11 = 0xB0  Background bright yellow
                      12 = 0xC0  Background bright green
                      13 = 0xD0  Background bright cyan
                      14 = 0xE0  Background bright blue
                      15 = 0xF0  Background bright magenta

Again you can use a bitwise OR to combine any text and background color.
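
Since both halves of the octet use the same 16 colors, one lookup list is enough. A possible Python sketch follows; the helper names are made up, and the list order follows the table above:

```python
COLORS = [
    "black", "dark grey", "dark red", "dark yellow",
    "dark green", "dark cyan", "dark blue", "dark magenta",
    "grey", "white", "bright red", "bright yellow",
    "bright green", "bright cyan", "bright blue", "bright magenta",
]


def color_octet(text: str, background: str) -> int:
    """Combine a text color (bits 3-0) and a background color (bits 7-4)."""
    return (COLORS.index(background) << 4) | COLORS.index(text)


def split_color_octet(octet: int) -> tuple:
    """Return (text color, background color) for a color octet."""
    return COLORS[octet & 0x0F], COLORS[octet >> 4]


print(hex(color_octet("bright yellow", "dark red")))
print(split_color_octet(0x2B))
```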

Now for a complete walkthrough of an example SMS with some text formatting. Here is the AT command to send the message:

AT+CMGS=100
0041000B915121551532F40000631A0A031906200A032104100A032705040A032E05080A0438
07002B8ACD29A85D9ECFC3E7F21C340EBB41E3B79B1E4EBB41697A989D1EB340E2379BCC02B1
C3F27399059AB7C36C3628EC2683C66FF65B5E2683E8653C1D
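
The number after AT+CMGS= is the length of the PDU in octets, not counting the SMSC information at the start. As a quick check, this Python sketch recomputes it for the PDU above (the function name cmgs_length is only for illustration):

```python
# The PDU from the AT command above, joined into one hex string.
PDU_HEX = (
    "0041000B915121551532F40000631A0A031906200A032104100A032705040A032E05080A0438"
    "07002B8ACD29A85D9ECFC3E7F21C340EBB41E3B79B1E4EBB41697A989D1EB340E2379BCC02B1"
    "C3F27399059AB7C36C3628EC2683C66FF65B5E2683E8653C1D"
)


def cmgs_length(pdu_hex: str) -> int:
    """Length to pass to AT+CMGS: all octets except the SMSC information."""
    pdu = bytes.fromhex(pdu_hex)
    smsc_length = pdu[0]  # first octet: number of SMSC octets that follow
    return len(pdu) - 1 - smsc_length


print(cmgs_length(PDU_HEX))
```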


Here is the detailed analysis to help you understand it:

Size      Value           Description
1 octet   0x00            We don’t supply an SMSC number.
1 octet   0x41            PDU type and options. This is a plain SMS-SUBMIT PDU and there is a UDH present.
1 octet   0x00            Our message reference.
1 octet   0x0B            Size of the destination telephone number (in digits).
1 octet   0x91            International number format.
6 octets  0x5121551532F4  This represents the destination and it translates to 1 512 555 1234.
1 octet   0x00            Protocol identifier.
1 octet   0x00            Data Coding Scheme. DCS 0x00 stands for a plain GSM-7 encoded text message.
1 octet   0x63            User Data Length or payload size (in septets).
1 octet   0x1A            User Data Header Length or UDHL. Size of the UDH in octets.
1 octet   0x0A            Start of a text formatting IE.
1 octet   0x03            IE length is 3 octets.
1 octet   0x19            Start formatting at character position 25 (counted from 0).
1 octet   0x06            Formatting is for 6 characters.
1 octet   0x20            Use italic text.
1 octet   0x0A            Start of a text formatting IE.
1 octet   0x03            IE length is 3 octets.
1 octet   0x21            Start formatting at character position 33.
1 octet   0x04            Formatting is for 4 characters.
1 octet   0x10            Use bold text.
1 octet   0x0A            Start of a text formatting IE.
1 octet   0x03            IE length is 3 octets.
1 octet   0x27            Start formatting at character position 39.
1 octet   0x05            Formatting is for 5 characters.
1 octet   0x04            Use large text.
1 octet   0x0A            Start of a text formatting IE.
1 octet   0x03            IE length is 3 octets.
1 octet   0x2E            Start formatting at character position 46.
1 octet   0x05            Formatting is for 5 characters.
1 octet   0x08            Use small text.
1 octet   0x0A            Start of a text formatting IE.
1 octet   0x04            IE length is 4 octets.
1 octet   0x38            Start formatting at character position 56.
1 octet   0x07            Formatting is for 7 characters.
1 octet   0x00            No text formatting.
1 octet   0x2B            Background dark red and text bright yellow.
60 octets                 The remainder of the octets contain the text (68 septets, preceded by 1 padding bit so that the text starts on a septet boundary):

“EMS messages can contain italic, bold, large, small and colored text”
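
To see how the UDH and the packed text share the payload, here is a Python sketch that unpacks the text from the example PDU. It is an illustration under a few assumptions: the function name decode_user_data is invented, the offsets are hard-coded for this particular PDU, and the text is assumed to contain only characters whose GSM-7 code equals their ASCII code (true for this message).

```python
# The PDU from the AT command above, joined into one hex string.
PDU_HEX = (
    "0041000B915121551532F40000631A0A031906200A032104100A032705040A032E05080A0438"
    "07002B8ACD29A85D9ECFC3E7F21C340EBB41E3B79B1E4EBB41697A989D1EB340E2379BCC02B1"
    "C3F27399059AB7C36C3628EC2683C66FF65B5E2683E8653C1D"
)


def decode_user_data(udl: int, ud: bytes) -> tuple:
    """Return (UDH without the UDHL octet, text) for GSM-7 user data with a UDH."""
    header_octets = 1 + ud[0]                      # UDHL octet plus the UDH itself
    header_septets = (header_octets * 8 + 6) // 7  # rounded up to a septet boundary
    bits = int.from_bytes(ud, "little")            # septet i occupies bits 7*i .. 7*i+6
    septets = [(bits >> (7 * i)) & 0x7F for i in range(header_septets, udl)]
    # Assumption: every septet maps to the ASCII character with the same code.
    text = "".join(chr(s) for s in septets)
    return ud[1:header_octets], text


pdu = bytes.fromhex(PDU_HEX)
# These offsets hold for this example only (no SMSC number, 11-digit destination).
udl, ud = pdu[13], pdu[14:]
udh, text = decode_user_data(udl, ud)
print(len(udh), len(text))
print(text)
```

The 27 header octets take up 216 bits, so the text starts at septet 31 (bit 217). The single bit in between is the padding bit.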



The text produced by this message (when received on a phone that supports all EMS features used) would look like this:

EMS messages can contain italic, bold, large, small and colored text
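
To check which part of the text each IE covers, you can slice the text with the start position and length from each IE. A small Python sketch follows. The tuple layout and the function name formatted_spans are made up for this example, and it does not handle the special length value 0.

```python
TEXT = "EMS messages can contain italic, bold, large, small and colored text"

# (start position, length, formatting octet, color octet or None) per IE
IES = [
    (0x19, 6, 0x20, None),  # italic
    (0x21, 4, 0x10, None),  # bold
    (0x27, 5, 0x04, None),  # large
    (0x2E, 5, 0x08, None),  # small
    (0x38, 7, 0x00, 0x2B),  # bright yellow text on dark red
]


def formatted_spans(text: str, ies: list) -> list:
    """Return (substring, formatting, color) for every formatting IE."""
    return [
        (text[start:start + length], formatting, color)
        for start, length, formatting, color in ies
    ]


for span, formatting, color in formatted_spans(TEXT, IES):
    print(repr(span), hex(formatting), color)
```

The start positions count from 0, so position 25 (0x19) is the first letter of the word that is shown in italics.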

A phone that doesn’t support some or all of the EMS features used will simply skip over the IEs it doesn’t know and produce the text without the indicated markup.


Reader Comments

Hello. In this example, how are the last 54 octets coded? Did you use the same encoding when you send a normal SMS and when you send an EMS? Thank you for your response and congratulations on this blog.

#1 
Written By Fernando V. on June 9th, 2009 @ 5:35 pm

For the text encoding see the article “More on the SMS PDU”. I’ve linked this article for your convenience.
Good luck,
Jeroen

#2 
Written By Jeroen on June 9th, 2009 @ 7:14 pm

Hi! Could you explain why the UDL is 0x63? This octet is the UDL, or number of septets (letters), but there are 69 septets, not 99 (dec) = 63 (hex). If 0x63 is the UDL, how do you count it? I tried, but I always count more or less than 99 ;)

PS. Great and useful blog :) Sorry for my poor English ;)

#3 
Written By osa on July 19th, 2009 @ 12:48 pm

Hi Osa,

The User Data (UD) or payload includes both text and the User Data Header (UDH). Even the single octet that represents the User Data Header Length (UDHL) is part of the payload. It is always confusing to count all of this in septets, since the UDH consists of octets (not septets), but this is the way it is.

The 27 octets (216 bits) for UDHL + UDH occupy 31 septets (217 bits). The next 68 septets (characters) are for the text (which starts on a septet boundary). This makes 99 septets total (= 63 hex).

Does it make sense this way?

Good luck,
Jeroen

#4 
Written By Jeroen on July 19th, 2009 @ 2:22 pm

Hi Jeroen! It makes sense :) In my calculation I forgot to add the UDH ;) It was only one octet but… was ;) Thanks a lot!

#5 
Written By osa on July 20th, 2009 @ 1:56 am

Hi Joren,
Thanks for this wonderful posting. One query: I am not able to figure out how you encoded the last 54 data octets, i.e. ‘EMS messages can contain italic, bold, large, small and colored text’. I tried to encode it the way you described in your SMS PDU article, but I get a different translation: ‘C5E6…’, not ‘8ACD29A…’ as you have mentioned. Am I missing something?

#6 
Written By sunny on August 18th, 2009 @ 2:00 am

Hi Sunny,

I am glad you find my posting useful.

Did you check out the article on how to pack GSM-7 characters into 7 bits? The issue you may be running into is that you’re not starting the text on a septet boundary. In my particular example I have a UDH of 27 octets (this is including the octet for the UDHL). These 27 octets take up 216 bits. This uses the space of 31 septets (=217 bits), so I needed to add 1 padding bit.

Jeroen

#7 
Written By Jeroen on August 18th, 2009 @ 10:17 am

Hi Jeroen,
Thanks for explaining that. Let me have a new look at it based on the info you gave.
Also, just like your articles on the SMS/EMS format, it would be great if you started something on MMS encoding as well. Waiting to read about that.
Good Day,
Sunny

#8 
Written By Sunny on August 18th, 2009 @ 10:07 pm

Hi Jeroen,

Do you know if all phones support EMS?
Do you have sample smpp log for a sm submit with ems content?

Thanks in advance for your assistance.

Alan

#9 
Written By Alan on June 1st, 2010 @ 7:29 am
