path: root/epan/charsets.c
Age | Commit message | Author | Files | Lines
2017-06-16 | Fix SURROGATE_VALUE() to match what RFC 2781 says. | Guy Harris | 1 | -0/+2
    While we're at it, note in the comment for get_utf_16_string() the "decoding UTF-16" algorithm in RFC 2781.
    Change-Id: I5d7dc5c09af0474c055796e49e0c7b94fa87d2ad
    Reviewed-on: https://code.wireshark.org/review/22171
    Reviewed-by: Guy Harris <guy@alum.mit.edu>
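For readers unfamiliar with the "decoding UTF-16" step that RFC 2781 describes, the following is a minimal sketch of it. The helper names (`is_lead_surrogate`, `combine_surrogates`, `decode_utf16`) are hypothetical and are not the macros or routines in epan/charsets.c, and substituting U+FFFD for an unpaired surrogate is an assumption made for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; not Wireshark's actual code. */
static int is_lead_surrogate(uint16_t w)  { return w >= 0xD800 && w <= 0xDBFF; }
static int is_trail_surrogate(uint16_t w) { return w >= 0xDC00 && w <= 0xDFFF; }

/* RFC 2781: take the low 10 bits of each unit, concatenate them
 * (lead bits high, trail bits low), then add 0x10000. */
static uint32_t combine_surrogates(uint16_t lead, uint16_t trail)
{
    return ((((uint32_t)lead & 0x3FF) << 10) | ((uint32_t)trail & 0x3FF)) + 0x10000;
}

/* Decode one code point starting at units[*pos] and advance *pos.
 * The caller must ensure *pos < len. Unpaired surrogates yield U+FFFD
 * (an assumption of this sketch). */
static uint32_t decode_utf16(const uint16_t *units, size_t len, size_t *pos)
{
    uint16_t w1 = units[(*pos)++];

    if (!is_lead_surrogate(w1) && !is_trail_surrogate(w1))
        return w1;                      /* ordinary BMP code point */

    if (is_lead_surrogate(w1) && *pos < len && is_trail_surrogate(units[*pos]))
        return combine_surrogates(w1, units[(*pos)++]);

    return 0xFFFD;                      /* lone or misordered surrogate */
}
```

For example, the pair 0xD83D, 0xDE00 combines to U+1F600: the lead unit contributes 0x03D shifted left by 10 bits, the trail unit contributes 0x200, and 0x10000 is added to the result.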
2016-12-12 | Rename non-EBCDIC-specific routines. | Guy Harris | 1 | -1/+1
    Those routines can handle any single-byte character set whose characters map to characters in the Basic Multilingual Plane; it could be used for extended ASCII, but we have another routine for that, mapping only characters with code points > 0x7f, so we just say "nonascii" rather than "ebcdic".
    Change-Id: I3d55b5d58e3e7ab08f3dfbfdb57a0301a30e71d4
    Reviewed-on: https://code.wireshark.org/review/19214
    Reviewed-by: Guy Harris <guy@alum.mit.edu>
2016-12-12 | Fix handling of EBCDIC string fields. | Guy Harris | 1 | -37/+116
    Have a routine that takes a 256-element translation table and uses it to map various flavors of EBCDIC to Unicode. Have separate translation tables for "common" EBCDIC (everything that's the same in all EBCDIC code pages that include the original EBCDIC characters) and EBCDIC code page 037. Add ENC_EBCDIC_CP037 for code page 037.
    Change-Id: Ia882b3c0abef9e30eb54cd47396e6fa0d6342044
    Reviewed-on: https://code.wireshark.org/review/19212
    Reviewed-by: Guy Harris <guy@alum.mit.edu>
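The table-driven approach described in this commit can be sketched roughly as follows. This is an illustration under stated assumptions, not the routine in epan/charsets.c: `demo_table`, `put_utf8_bmp` and `table_to_utf8` are hypothetical names, the table is deliberately partial (a handful of well-known EBCDIC positions), and treating an unset entry as U+FFFD is a choice made only for this sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Partial demonstration table: byte value -> BMP code point.
 * Entries left at 0 are treated as "unmapped" in this sketch. */
static const uint16_t demo_table[256] = {
    [0x40] = 0x0020,                                    /* space   */
    [0xC1] = 0x0041, [0xC2] = 0x0042, [0xC3] = 0x0043,  /* A, B, C */
    [0xF0] = 0x0030, [0xF1] = 0x0031,                   /* 0, 1    */
};

/* Write one BMP code point as UTF-8; returns the number of bytes written. */
static size_t put_utf8_bmp(uint16_t cp, uint8_t *out)
{
    if (cp < 0x80) {
        out[0] = (uint8_t)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (uint8_t)(0xC0 | (cp >> 6));
        out[1] = (uint8_t)(0x80 | (cp & 0x3F));
        return 2;
    }
    out[0] = (uint8_t)(0xE0 | (cp >> 12));
    out[1] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
    out[2] = (uint8_t)(0x80 | (cp & 0x3F));
    return 3;
}

/* Translate len input bytes through table into NUL-terminated UTF-8.
 * out must hold at least 3 * len + 1 bytes. Returns the UTF-8 length. */
static size_t table_to_utf8(const uint8_t *in, size_t len,
                            const uint16_t table[256], uint8_t *out)
{
    size_t n = 0;

    for (size_t i = 0; i < len; i++) {
        uint16_t cp = table[in[i]];
        if (cp == 0)
            cp = 0xFFFD;            /* unmapped in this partial table */
        n += put_utf8_bmp(cp, out + n);
    }
    out[n] = '\0';
    return n;
}
```

Because the routine takes the table as a parameter, supporting another single-byte code page only requires supplying a different 256-entry table, which matches the commit's split between a "common" EBCDIC table and a code page 037 table.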
2016-10-22 | Add T.61 character set support | Pascal Quantin | 1 | -0/+294
    Bug: 13032
    Change-Id: I6bf2cc2c43a6262d899a304df6576d9831115966
    Reviewed-on: https://code.wireshark.org/review/18350
    Petri-Dish: Michael Mann <mmann78@netscape.net>
    Tested-by: Petri Dish Buildbot <buildbot-no-reply@wireshark.org>
    Reviewed-by: Michael Mann <mmann78@netscape.net>
2014-09-21 | Fix dissection of 7 bits ASCII/GSM strings when the bit offset is not byte aligned | Pascal Quantin | 1 | -8/+10
    Bug: 10491
    Change-Id: Ib55d83b7739050ba5afd84e8184af3c4608d5776
    Reviewed-on: https://code.wireshark.org/review/4228
    Tested-by: Pascal Quantin <pascal.quantin@gmail.com>
    Petri-Dish: Pascal Quantin <pascal.quantin@gmail.com>
    Tested-by: Petri Dish Buildbot <buildbot-no-reply@wireshark.org>
    Reviewed-by: Pascal Quantin <pascal.quantin@gmail.com>
2014-04-25 | Use 4-space indentation consistently in epan/charsets.c. | Guy Harris | 1 | -314/+314
    Make the EBCDIC <-> ASCII translation tables const, while we're at it.
    Change-Id: I15a08f7329fd32f758cf36898fe4214ae8540462
    Reviewed-on: https://code.wireshark.org/review/1343
    Reviewed-by: Guy Harris <guy@alum.mit.edu>
2014-04-25 | Add a get_ebcdic_string() routine, similar to other get_XXX_string() routines. | Guy Harris | 1 | -128/+153
    Use it in epan/tvbuff.c. Do some other cleanups while we're at it.
    Change-Id: I7aed37a568373b896aacfd23f986d445b58b77b7
    Reviewed-on: https://code.wireshark.org/review/1342
    Reviewed-by: Guy Harris <guy@alum.mit.edu>
2014-04-25 | Move the XXX-to-UTF-8 loops to routines in epan/charsets.c. | Guy Harris | 1 | -46/+482
    This moves a bunch of character set knowledge into epan/charsets.c.
    Change-Id: Ieb79dcaac9753c77703af756b666ad2ca9385d9e
    Reviewed-on: https://code.wireshark.org/review/1339
    Reviewed-by: Guy Harris <guy@alum.mit.edu>
2014-04-25 | Move GSM guint8 to unicode conversion functions to charsets.c | Jakub Zawadzki | 1 | -3/+60
    charsets.c is already place with huge number of conversion tables. Also make gsm_default_alphabet gunichar2, all values fits in 2 bytes.
    Change-Id: Ia5ab6c176b4fec21ec76b06513c1d00794ba10ef
    Reviewed-on: https://code.wireshark.org/review/1328
    Reviewed-by: Anders Broman <a.broman58@gmail.com>
2014-04-12 | Add Mac Roman and DOS CP437. | Guy Harris | 1 | -0/+40
    Change-Id: Ib96f2cf4ea71cd0cc2c703d58b9d254bf4c1248a
    Reviewed-on: https://code.wireshark.org/review/1077
    Reviewed-by: Guy Harris <guy@alum.mit.edu>
2014-03-04 | Remove all $Id$ from top of file | Alexis La Goutte | 1 | -2/+0
    (Using sed : sed -i '/^ \* \$Id\$/,+1 d')
    Fix manually some typo (in export_object_dicom.c and crc16-plain.c)
    Change-Id: I4c1ae68d1c4afeace8cb195b53c715cf9e1227a8
    Reviewed-on: https://code.wireshark.org/review/497
    Reviewed-by: Anders Broman <a.broman58@gmail.com>
2013-12-21 | Add the rest of ISO-8859-n, thanks to Jakub's "generate a mapping table" program. | Guy Harris | 1 | -0/+220
    Put the character-encoding cases in order.
    svn path=/trunk/; revision=54344
2013-12-18 | Add charset table for ISO/IEC 8859-9 (ENC_ISO_8859_9) | Jakub Zawadzki | 1 | -0/+20
    svn path=/trunk/; revision=54239
2013-12-15 | add support for ISO 8859-5 | Martin Kaiser | 1 | -0/+20
    svn path=/trunk/; revision=54132
2013-12-15 | as requested, move the functions/defines for DVB character tables to separate files | Martin Kaiser | 1 | -202/+0
    svn path=/trunk/; revision=54113
2013-12-09 | TABs -> spaces | Martin Kaiser | 1 | -52/+65
    add editor modelines
    svn path=/trunk/; revision=53888
2013-12-09 | From Jakub | Martin Kaiser | 1 | -0/+203
    support DVB-SI character tables (EN 300 468) in a generic way
    From me
    move things to charsets.c/.h
    distinguish between single and multi byte encoding for some tables (so that the highlighted bytes match the displayed value)
    no character table byte -> length 0, use default table
    svn path=/trunk/; revision=53886
2013-12-08 | Encoding table for ISO/IEC 8859-2: make code points in the range 0x80-0x9F map to 0x80-0x9F (Guy Harris). | Jakub Zawadzki | 1 | -8/+6
    svn path=/trunk/; revision=53865
2013-12-08 | Add ENC_ISO_8859_1. | Guy Harris | 1 | -0/+2
    Move the Wikipedia links for the code page layouts in front of the tables whose contents reflect the code page layouts.
    svn path=/trunk/; revision=53837
2013-12-07 | Note what the two new character encoding tables in charsets.c are. | Guy Harris | 1 | -0/+6
    svn path=/trunk/; revision=53833
2013-12-07 | Add string encoding for ISO/IEC 8859-2 (ENC_ISO_8859_2) | Jakub Zawadzki | 1 | -0/+20
    svn path=/trunk/; revision=53826
2013-12-07 | Add new string proto encoding for windows-1250 (ENC_WINDOWS_1250) | Jakub Zawadzki | 1 | -0/+23
    - Move windows-1250 to unicode encoding table to charset.c
    - Add tvb_get_string_unichar2, tvb_get_stringz_unichar2 functions which recode tvb-string to UTF-8.
    svn path=/trunk/; revision=53819
2012-09-20 | We always HAVE_CONFIG_H so don't bother checking whether we have it or not. | Jeff Morriss | 1 | -3/+1
    svn path=/trunk/; revision=45016
2012-06-28 | Update Free Software Foundation address. | Jakub Zawadzki | 1 | -1/+1
    (COPYING will be updated in next commit)
    svn path=/trunk/; revision=43536
2011-07-12 | Add a bunch of URLs for character encoding information. | Guy Harris | 1 | -0/+27
    svn path=/trunk/; revision=37986
2006-05-21 | name change | Ronnie Sahlberg | 1 | -2/+2
    svn path=/trunk/; revision=18197
2004-09-10 | Move the stuff to handle ASCII <-> EBCDIC conversions to "epan/charsets.c"; other character set translation code should perhaps go there as well. | Guy Harris | 1 | -0/+144
    svn path=/trunk/; revision=11958