'From Squeak3.8-Nihongo of 25 February 2005 [latest update: #2] on 24 February 2005 at 10:32:22 pm'!
"Change Set: commentsFeb24
Date: 24 February 2005
Author: Yoshiki Ohshima

Comments."!

!CompoundTextConverterState commentStamp: '' prior: 0!
This represents the state of CompoundTextConverter.!

!LanguageEnvironment commentStamp: '' prior: 0!
The name multilingualized Squeak suggests that you can use multiple languages at one time. This is true, of course, but the system still has to manage the primary language, which provides the interpretation of data going out to or coming in from the outside world. It also determines how to render strings, as the rendering rules can differ from one language to another, even if the code points in a string are the same.

Originally, LanguageEnvironment and its subclasses only had class side methods. After it was merged with Diego's Babel work, it now has instance side methods as well. For this historical reason, the class side and the instance side are not well related.

When we talk about the interface with the outside of the Squeak world, there are three different "channels": the keyboard input, clipboard output and input, and file names. On a not-too-uncommon system such as a Unix system localized to Japan, all three of these can have (and do have) different encodings. So we need to manage them separately. Note that the encoding of a file's content can be anything. While it is nice to provide a suggested guess for this 'default system file content encoding', it is not critical.

Rendering support is limited to basic L-to-R rendering so far. But you can provide a different line-wrap rule, at least.
!

!GreekEnvironment commentStamp: '' prior: 0!
This class provides the support for Greek. It is here, but most of the methods are not implemented yet.
!

!JapaneseEnvironment commentStamp: '' prior: 0!
This class provides the Japanese support.
Since it has been used more than any other environment besides the default 'latin-1' languages, this tends to be a good place to look when you want to know what a typical subclass of LanguageEnvironment should do.
!

!KoreanEnvironment commentStamp: '' prior: 0!
This class provides the Korean support. Unfortunately, we haven't tested this yet. We did have a working version in previous implementations, but not in this new implementation. But as soon as we find somebody who understands the language, we can probably make it work in two days or so, as we have done for Czech support.!

!Latin1Environment commentStamp: '' prior: 0!
This class provides the support for the languages in the 'Latin-1' category. Although we could have different language environments for different languages in the category, so far nobody has seriously needed it.
!

!Latin2Environment commentStamp: '' prior: 0!
This class provides the support for the languages in the 'Latin-2' category. Although we could have different language environments for different languages in the category, so far nobody has seriously needed it.

I (Yoshiki) don't have good knowledge of these languages, so when Pavel Krivanek volunteered to implement the details, it was a good test to see how flexible my m17n framework was. There were a few glitches, but with several email conversations over a few days, we managed to make it work relatively painlessly. I thought this went well.

There seems to be some source of headache, as Windows doesn't use exactly Latin-2 encoded characters, but a slightly modified version called 'code page 1250'. Similar to the Japanese support, the encoding interpreters are swapped based on the type of platform Squeak is running on.
!

!MultiByteBinaryOrTextStream commentStamp: '' prior: 0!
It is similar to MultiByteFileStream, but works on an in-memory stream.!

!MultiByteFileStream commentStamp: '' prior: 0!
The central class to access the external file.
The interface of this object is similar to good old StandardFileStream, but internally it asks the converter, which is a sub-instance of TextConverter, to do the text conversion.

It also incorporates the good old CrLfFileStream. CrLfFileStream class>>new now returns an instance of MultiByteFileStream.

There are several pitfalls:

* You always have to be careful about the binary/text distinction. In #text mode, it usually interprets the bytes.
* A few file pointer operations treat the file as uninterpreted bytes no matter what. This means that if you use 'fileStream skip: -1', 'fileStream position: x', etc. in #text mode, the file position can end up in the middle of a multi-byte character. If you want to implement some function similar to #peek, for example, call the converter's saveStateOf: and restoreStateOf:with: methods to be able to get back to the original state.
* #lineEndConvention: and #wantsLineEndConversion: (and #binary) can cause some puzzling situations because the inst vars lineEndConvention and wantsLineEndConversion are mutated.

If you have any suggestions to clean up the protocol, please let me know.!

!SimplifiedChineseEnvironment commentStamp: '' prior: 0!
This class provides the Simplified Chinese support (used mainly in Mainland China). Unfortunately, we haven't tested this yet, but as soon as we find somebody who understands the language, we can probably make it work in two days or so, as we have done for Czech support.!

!SparseLargeTable commentStamp: '' prior: 0!
Derived from Stephan Pair's LargeArray, but to hold a sparse table, in which most of the entries are the same default value, it uses some tricks.!

!TextConverter commentStamp: '' prior: 0!
The abstract class for all the different types of text converters. nextFromStream: and nextPut:toStream: are the publicly accessible methods. If you are going to make a subclass for a stateful text conversion, you should override restoreStateOf:with: and saveStateOf: along the lines of CompoundTextConverter.
!
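
"A minimal sketch of the peek-like pattern mentioned in the MultiByteFileStream comment, assuming that saveStateOf: answers an object which restoreStateOf:with: accepts back. The names aStream, converter, saved and ch are hypothetical and stand for a text-mode stream, its TextConverter and two temporaries.

	saved := converter saveStateOf: aStream.
	ch := converter nextFromStream: aStream.
	converter restoreStateOf: aStream with: saved.

Under that assumption the stream position and any converter state are back where they started after the last statement, so ch can be treated as a peeked character. A stateful converter such as CompoundTextConverter is where skipping the save and restore steps would be most likely to go wrong."!
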
!CP1250TextConverter commentStamp: '' prior: 0!
Text converter for CP1250. Windows code page used in Eastern Europe.!

!CP1253TextConverter commentStamp: '' prior: 0!
Text converter for CP1253. Windows code page used for Greek.!

!CompoundTextConverter commentStamp: '' prior: 0!
Text converter for X Compound Text.!

!EUCTextConverter commentStamp: '' prior: 0!
Text converter for Extended Unix Code (EUC). This is an abstract class. The CJK variations are implemented as subclasses.!

!CNGBTextConverter commentStamp: '' prior: 0!
Text converter for the Simplified Chinese variation of EUC. (Even though the name doesn't suggest it, that is what it is.)!

!EUCJPTextConverter commentStamp: '' prior: 0!
Text converter for the Japanese variation of EUC.!

!EUCKRTextConverter commentStamp: '' prior: 0!
Text converter for the Korean variation of EUC.!

!ISO88592TextConverter commentStamp: '' prior: 0!
Text converter for ISO 8859-2. An international encoding used in Eastern Europe.!

!ISO88597TextConverter commentStamp: '' prior: 0!
Text converter for ISO 8859-7. An international encoding used for Greek.!

!Latin1TextConverter commentStamp: '' prior: 0!
Text converter for ISO 8859-1. An international encoding used in Western Europe.!

!MacRomanTextConverter commentStamp: '' prior: 0!
Text converter for Mac Roman. An encoding used for languages originating in Western Europe.!

!ShiftJISTextConverter commentStamp: '' prior: 0!
Text converter for Shift-JIS. Mac and Windows in Japanese mode use this encoding.!

!UTF16TextConverter commentStamp: '' prior: 0!
Text converter for UTF-16. It supports both byte orders and the byte order mark.!

!UTF8TextConverter commentStamp: '' prior: 0!
Text converter for UTF-8. Since the BOM is used to distinguish MacRoman code from UTF-8 code, the BOM is written for UTF-8 by #writeBOMOn:, which is called by the client.!

!UTF8TextConverter class reorganize!
('accessing' writeBOMOn:)
('utilities' encodingNames)
!

!UTF8TextConverter reorganize!
('conversion' nextFromStream: nextPut:toStream:)
('accessing' forceToEncodingTag forceToEncodingTag:)
('friend' currentCharSize leadingChar)
!

!UTF16TextConverter class reorganize!
('utilities' encodingNames)
!

!UTF16TextConverter reorganize!
('conversion' nextFromStream: nextPut:toStream:)
('accessing' useByteOrderMark useByteOrderMark: useLittleEndian useLittleEndian:)
('private' charFromStream:withFirst: next16BitValue:toStream:)
!

!ShiftJISTextConverter class reorganize!
('utilities' encodingNames)
!

!MacRomanTextConverter class reorganize!
('utilities' encodingNames)
!

!Latin1TextConverter class reorganize!
('utilities' encodingNames)
!

!ISO88597TextConverter class reorganize!
('class initialization' initialize)
('utilities' encodingNames)
!

!ISO88592TextConverter class reorganize!
('class initialization' initialize)
('utilities' encodingNames)
!

!EUCKRTextConverter class reorganize!
('utilities' encodingNames)
!

!EUCJPTextConverter class reorganize!
('utilities' encodingNames)
!

!CompoundTextConverter class reorganize!
('utilities' encodingNames)
!

!CP1253TextConverter class reorganize!
('utilities' encodingNames)
!

!CP1250TextConverter class reorganize!
('class initialization' initialize)
('utilities' encodingNames)
!

CNGBTextConverter class removeSelector: #example1!

CNGBTextConverter class removeSelector: #example2!

!CNGBTextConverter class reorganize!
('utilities' encodingNames)
!

!TextConverter class reorganize!
('as yet unclassified')
('instance creation' default defaultConverterClassForEncoding: defaultSystemConverter newForEncoding:)
('utilities' allEncodingNames encodingNames)
!

!ShiftJISTextConverter reorganize!
('conversion' nextFromStream: nextPut:toStream:)
('private' katakanaValue: sjisKatakanaFor: toUnicode:)
('friend' leadingChar)
!

!MacRomanTextConverter reorganize!
('conversion' nextFromStream: nextPut:toStream:)
('friend' currentCharSize leadingChar)
!

!Latin1TextConverter reorganize!
('conversion' nextFromStream: nextPut:toStream:)
('friend' currentCharSize)
!

!ISO88597TextConverter reorganize!
('conversion' nextFromStream: nextPut:toStream:)
('private' fromSqueak: toSqueak:)
!

!ISO88592TextConverter reorganize!
('conversion' nextFromStream: nextPut:toStream:)
('private' fromSqueak: toSqueak:)
!

!EUCKRTextConverter reorganize!
('private' languageEnvironment leadingChar)
!

!EUCJPTextConverter reorganize!
('private' languageEnvironment leadingChar)
!

!CNGBTextConverter reorganize!
('private' languageEnvironment leadingChar)
!

!EUCTextConverter reorganize!
('conversion' nextFromStream: nextPut:toStream:)
('friend' restoreStateOf:with: saveStateOf:)
('private' languageEnvironment leadingChar nonUnicodeClass)
!

CompoundTextConverter removeSelector: #errorMalformedInput!

CompoundTextConverter removeSelector: #errorUnsupported!

!CompoundTextConverter reorganize!
('initialize-release' initialize)
('conversion' nextFromStream: nextPut:toStream:)
('friend' currentCharSize emitSequenceToResetStateIfNeededOn: restoreStateOf:with: saveStateOf:)
('query' accepts:)
('private' nextPutValue:toStream:withShiftSequenceIfNeededForLeadingChar: parseShiftSeqFromStream:)
!

!CP1253TextConverter reorganize!
('conversion' nextFromStream:)
('private' toSqueak:)
!

!CP1250TextConverter reorganize!
('conversion' nextFromStream: nextPut:toStream:)
('private' fromSqueak: toSqueak:)
!

TextConverter removeSelector: #errorMalformedInput!

!TextConverter reorganize!
('conversion' nextFromStream: nextPut:toStream:)
('friend' currentCharSize emitSequenceToResetStateIfNeededOn: restoreStateOf:with: saveStateOf:)
('query' accepts:)
!

!SimplifiedChineseEnvironment class reorganize!
('language methods' beCurrentNaturalLanguage traditionalCharsetClass)
('subclass responsibilities' clipboardInterpreterClass inputInterpreterClass supportedLanguages)
('public query' defaultEncodingName)
!

!MultiByteFileStream class reorganize!
('class initialization' defaultToCR defaultToCRLF defaultToLF guessDefaultLineEndConvention initialize startUp)
('instance creation' newFrom:)
!

!MultiByteFileStream reorganize!
('accessing' ascii binary converter converter: fileInEncodingName: lineEndConvention lineEndConvention: wantsLineEndConversion:)
('public' next next: nextDelimited: nextMatchAll: nextPut: nextPutAll: peek peekFor: skipSeparators skipSeparatorsAndPeekNext upTo: upToEnd)
('crlf private' bareNext convertStringFromCr: convertStringToCr: detectLineEndConvention doConversion next:innerFor: wantsLineEndConversion)
('private basic' basicNext: basicNext:into: basicNextInto: basicNextPut: basicNextPutAll: basicPeek basicPosition basicPosition: basicReadInto:startingAt:count: basicSetToEnd basicSkip: basicUpTo: basicVerbatim:)
('open/close' open:forWrite: reset)
('remnant' accepts: filterFor:)
('private' setConverterForCode)
('fileIn/Out' fileIn fileOutClass:andObject:)
!

!MultiByteBinaryOrTextStream reorganize!
('accessing' ascii binary converter converter: isBinary text)
('public' contents next next: nextDelimited: nextMatchAll: nextPut: nextPutAll: padToEndWith: peek peekFor: reset skipSeparators skipSeparatorsAndPeekNext upTo: upToEnd)
('private basic' basicNext basicNext: basicNext:into: basicNextInto: basicNextPut: basicNextPutAll: basicPeek basicPosition basicPosition:)
('converting' asBinaryOrTextStream)
('private' guessConverter)
('fileIn/Out' fileIn fileInObjectAndCode fileOutClass:andObject: setConverterForCode setEncoderForSourceCodeNamed:)
('properties-setting' setFileTypeToObject)
!

!JapaneseEnvironment class reorganize!
('language methods' beCurrentNaturalLanguage flapTabTextFor:in: fromJISX0208String: removeFonts scanSelector traditionalCharsetClass)
('subclass responsibilities' clipboardInterpreterClass fileNameConverterClass inputInterpreterClass leadingChar supportedLanguages systemConverterClass)
('public query' defaultEncodingName)
('rendering support' isBreakableAt:in:)
!

!GreekEnvironment class reorganize!
('subclass responsibilities' leadingChar supportedLanguages)
!

!CompoundTextConverterState class reorganize!
('instance creation' g0Size:g1Size:g0Leading:g1Leading:charSize:streamPosition:)
!

!CompoundTextConverterState reorganize!
('accessing' charSize charSize: g0Leading g0Leading: g0Size g0Size: g0Size:g1Size:g0Leading:g1Leading:charSize:streamPosition: g1Leading g1Leading: g1Size g1Size: printOn: streamPosition streamPosition:)
!

Smalltalk removeClassNamed: #FileOutFormatTest!