Very nice initial work on Chinook. I'm pleased to see someone doing work on this script. It's an important script historically in the Pacific Northwest, but it's always seemed difficult to obtain information.

(I am not CCing this to the Unicode list to avoid random discussion there.)

A few specific initial comments on your proposal.

It might bear some explanation about why you chose circles in your font, instead of the oblique ovals always (?) found in manuscripts.

All of the rules look complicated, so it could probably bear more explanation and diagrams. Especially the stuff about use of variation selectors. It could use some examples with actual text fragments, to determine whether it's justified and whether or not you really need two VSes, etc.

Most or all of the cross references in the names list (with arrows ->) are not necessary. We usually only put in such references when the items are confusable, or when someone looking for one may come up with the other. E.g., for these script characters it's not necessary to cross reference so heavily.

You needn't specify a preferred location in the SMP, it's unlikely to be moved from where it's now roadmapped, and this suggestion would serve to confuse some people in some committees, causing excessive "churn". Also, the 102xx row is occupied with *ancient* scripts and the empty space is likely to be eventually roadmapped for more of those. Chinook is currently grouped very tentatively with other "shorthand" scripts, though we're not really sure if any of those would ever be encoded.

At some point, we might be able to assist with finding more scans for you, or connecting with L of C. I'm also CCing Deborah Anderson.

I hope the above comments give you some idea of areas to concentrate on for the next phase of your work.

Rick

vanisaac@boil.afraid.org wrote:
> Ever since I saw Chinook on the SMP roadmap, I've been dabbling with a proposal for including the Chinook script.
> The link below is what I have right now (along with about 3MB of gif images on my harddrive), and I'm looking for criticism, help, praise, and thoughts as to the wisdom of proceeding.
>
> http://www.geocities.com/vanisaac/Chinook.html
>
> PS, If anyone has subscription access to the digital library Early Canadiana Online (canadiana.org), I would appreciate your assistance in obtaining some of their archived scans of the Wawa texts in the Chinook script. The Library of Congress has been annoyingly vague about the cost and possibility of obtaining scans of texts from their rare book collection.
>
------------------------------------------------------------------------------
> Very nice initial work on Chinook. I'm pleased to see someone doing work
> on this script. It's an important script historically in the Pacific
> Northwest, but it's always seemed difficult to obtain information.

You're telling me. I'm just lucky to have access to the Library of Congress, or I don't think I ever would have appreciated the volume of texts, hence the need for this encoding. Without that push, I don't think I would have had the heart to put this much effort into figuring things out. Fortunately, I have until June living within biking distance of the LoC, so I hope I can do all the necessary legwork by then.

> (I am not CCing this to the Unicode list to avoid random discussion there.)

I completely understand and even appreciate it. Really.

> A few specific initial comments on your proposal.
>
> It might bear some explanation about why you chose circles in your font,
> instead of the oblique ovals always (?) found in manuscripts.

That is an issue I have mulled in the past, but there are two main reasons for my so called "normative" circle shapes for the circle vowels.

First, all of the Kamloops Wawa texts were handwritten and copied via a mimeograph machine.
I would guess that the independent letters were probably easiest/quickest to write in that somewhat oblong form, so that's what I have for example forms. In connected text, the letter shapes varied extensively from ovals to teardrops to egg shaped to a compressed circle, varying contextually and just plain different from week to week. I could show several examples of each of these just from the few scanned texts I do have on hand, and access to larger archives would bring in a great corpus of letter variation. The gist, however, is that these letter shapes all skirt around an "idealized" circular form, whether something close to that form is really ever realized or not. Bear in mind, all the texts we have are handwritten, and if I understand the technology right, LeJeune was actually scraping wax.

In the interest of full disclosure, I have to include my second reason: given the current typographic tools I have at my command (FontForge for Linux) and my historic tools (the more expressive High Logic Font Creator Program), circular glyph shapes were much easier to accomplish in a nice enough form that I was not embarrassed by the quality. That having been said, the images included with my draft proposal are not representations of the current font, which has a significantly lighter weight and a few minor variations, but maintains the pure circles of my included PNGs. I have also been endeavoring to accomplish a workable combining behavior in my font, for the use of the Chinook Jargon community at large, so I have a further reason for the glyph shapes: the rounded characters allow me to have one form of each circle vowel that will combine with any preceding or trailing consonant without the need of contextual variation (don't get me started on my travails coding for contextual orientation of I).

> All of the rules look complicated, so it could probably bear more
> explanation and diagrams. Especially the stuff about use of variation
> selectors.
> It could use some examples with actual text fragments, to
> determine whether it's justified and whether or not you really need two
> VSes, etc.

I would actually prefer a Combining Variation Selector over VS2 for the overlapping forms of letters - signifying initials of consonants, or W- vowels with O. Unfortunately, Combining Variation Selectors are not currently part of the Unicode Standard! I defer to the experts on the wisdom of including with the proposal a block of these in the BMP for the use of other scripts. I think they should be made available to minority languages with characters along the lines of Latin aesh and oethel - not strictly ligatures, but definitely combined letters of some sort. Whether they should be part of their own block of 16 like VS1-VS16, or just a couple characters appended to another block (I'm looking at FB08 and FB09 in Alphabetic Presentation Forms right now), I don't know.

As for definite character variants, I have included a bit of explanatory material on why I believe Ng, Ch/J, and Ts/Z should be considered variants of N, Sh, and S. The variant forms appear quite regularly in the texts, but are never realized in their alternate forms in some situations. It also seems to be somewhat common to have the standard form written when the word definitely has the variant pronunciation.

While I have not found strictly contrastive use of E vs I (there may be a perfect contrast in later years with "yet"= i+e+t and "eight"= e+i+t), a great deal of explanatory material on both divergent and convergent behavior is found within the Chinook rudiments text. Furthermore, the word-initial distinction of E and I seems to be absolutely preserved throughout the texts, and while I've never seen the syllable-final variant in running text before, I've not actually ever looked for it. In short, I/E acts like a single character that varies in definite contexts, and is invariant in other definite contexts.
That sounds identical to the behavior of the varying Mongolian letters. The Wi/We distinction is essentially by analogy to I/E, but I do have an example of the different forms being used in the same sentence, though not a scan, just a citation.

On the issue of explanations and diagrams, I have considered (you've probably pushed me over the line here) creating a page with nothing but each of the syllable combination rules laid out with all the examples I can find. In the end, I thought an example of each rule would at least give a glimpse at how the script combines letters, the brevity of which I thought more important than an all-out Q.E.D. justification for each rule. That having been said, over the next week or so, I will try to upload a good hundred words or so, and include them as examples of the different rules. Realize that I have a curling competition next weekend, so I may not be that fast.

> Most or all of the cross references in the names list (with arrows ->)
> are not necessary. We usually only put in such references when the items
> are confusable, or when someone looking for one may come up with the
> other. E.g., for these script characters it's not necessary to cross
> reference so heavily.

I just found what I considered a graphically similar character and copied its cross references from my Unicode Standard version 5.0. I figured thoroughness was preferable to brevity - especially when it's just copying text.

> You needn't specify a preferred location in the SMP, it's unlikely to be
> moved from where it's now roadmapped, and this suggestion would serve to
> confuse some people in some committees, causing excessive "churn". Also,
> the 102xx row is occupied with *ancient* scripts and the empty space is
> likely to be eventually roadmapped for more of those. Chinook is
> currently grouped very tentatively with other "shorthand" scripts,
> though we're not really sure if any of those would ever be encoded.

Ok.
I was under the impression that the roadmaps were pretty malleable - the pre-allocation for Chinook in the roadmap to the SMP has already changed from U+11D00 in SMP 5.0 to U+16C00 in 5.0.1. My hope was that it could be fronted as effectively as possible within a roadmap block that describes the script, and there is just a nice two column space between Carian and Old Italic. I figured that I might as well put all my cards on the table if I'm going to go through with this.

On a related note, since you're a captive audience, Michael, maybe you can shed some light on why there is that gap at 10200-1027F. I would have thought that Lycian and Carian would have gone there. Any insights into UTC here, or is it just an historical fluke?

> At some point, we might be able to assist with finding more scans for
> you, or connecting with L of C. I'm also CCing Deborah Anderson.

Sorry, Deborah, I don't know who you are, but I'm sure I'll get to know you.

> I hope the above comments give you some idea of areas to concentrate on
> for the next phase of your work.
>
> Rick

Thanks for all of your comments, insights, and thoughts, Rick. I really appreciate the time you have put in here, and I hope to do justice to your effort. Up to now, this has been my own private insanity, and I am grateful to have someone, or someones, with whom to share it.

-Van Anderson

PS, I am honored just to have Michael Everson Cc'd on an email for me. I fully display my geekiness here, but I absolutely consider this a celebrity encounter.

> vanisaac@boil.afraid.org wrote:
>> Ever since I saw Chinook on the SMP roadmap, I've been dabbling with a proposal for including the Chinook script. The link below is what I have right now (along with about 3MB of gif images on my harddrive), and I'm looking for criticism, help, praise, and thoughts as to the wisdom of proceeding.
>>
>> http://www.geocities.com/vanisaac/Chinook.html
>>
>> PS, If anyone has subscription access to the digital library Early Canadiana Online (canadiana.org), I would appreciate your assistance in obtaining some of their archived scans of the Wawa texts in the Chinook script. The Library of Congress has been annoyingly vague about the cost and possibility of obtaining scans of texts from their rare book collection.
>>
>
------------------------------------------------------------------------------
Hello Van,

Received your reply, thanks. We can see what others say and stay in contact. I may have more to say later, but for now...

You asked:

> maybe you can shed some light on why there is that gap at 10200-1027F. I would have thought that Lycian and Carian would have gone there. Any insights into UTC here, or is it just an historical fluke?

I suspect historical fluke. The roadmap is hashed out between three of us: Michael, Ken, and myself, and it's fairly malleable, and UTC & WG2 generally stay away from it... because it could be a serious time sink if a committee got hold of it.

Deborah Anderson is a researcher at UC Berkeley who runs the Script Encoding Initiative, among other things. She's been responsible for overseeing development work on most of the script additions to Unicode in the last 4 years. See:

http://www.unicodeconference.org/bios.htm
and
http://linguistics.berkeley.edu/~dwanders/

Ken Whistler is a Technical Director at Unicode, Inc, managing editor of the standard itself and chair of the editorial committee... and you apparently already know Michael Everson.

Ah, and I hope you win your curling match. My stepmother's parents were big curlers in northern BC.

Cheers,
Rick
------------------------------------------------------------------------------
Hi Van,

I run a project to help get characters and scripts proposed, and get through the standards process. I'd be more than happy to answer questions you may have on the process.
(I work closely with Rick and Ken, and Michael sometimes works for my project.) I am glad Rick sent you some comments. I am out of town and won't be able to review it until later this week, but I would be happy to send you my thoughts.

I just tried to see if UC Berkeley has a subscription to Early Canadiana Online, but it apparently does not. One person who may be able to help is Chris Harvey at chris@languagegeek.com. He has done work on Canadian Aboriginal Syllabics. I'll write to him now and ask, cc'ing you. If he can't, I have another idea for a possible way to get the scans you wish, which I am following up on now.

Thanks for your work on this!

With best wishes,
Debbie Anderson
Project Leader, Script Encoding Initiative

-----Original Message-----
From: unicode-bounce@unicode.org [mailto:unicode-bounce@unicode.org] On Behalf Of vanisaac@boil.afraid.org
Sent: Tuesday, February 10, 2009 8:35 PM
To: unicode@unicode.org
Subject: Draft proposal for inclusion of the Chinook script in Unicode

Ever since I saw Chinook on the SMP roadmap, I've been dabbling with a proposal for including the Chinook script. The link below is what I have right now (along with about 3MB of gif images on my harddrive), and I'm looking for criticism, help, praise, and thoughts as to the wisdom of proceeding.

http://www.geocities.com/vanisaac/Chinook.html

PS, If anyone has subscription access to the digital library Early Canadiana Online (canadiana.org), I would appreciate your assistance in obtaining some of their archived scans of the Wawa texts in the Chinook script. The Library of Congress has been annoyingly vague about the cost and possibility of obtaining scans of texts from their rare book collection.
------------------------------------------------------------------------------
Hi Chris,

I am currently out of town, but I wanted to follow up on a question posed by Van Anderson (no relation) regarding accessing early Canadiana materials:

“If anyone has subscription access to the digital library Early Canadiana Online (canadiana.org), I would appreciate your assistance in obtaining some of their archived scans of the Wawa texts in the Chinook script. The Library of Congress has been annoyingly vague about the cost and possibility of obtaining scans of texts from their rare book collection.”

Do you have access to Early Canadiana Online? I see UC Berkeley does not own a subscription. I may have another approach I could take, in case you aren’t able to access it.

Van’s Chinook draft proposal, by the way, is located at: http://www.geocities.com/vanisaac/Chinook.html.

Hope you (and family) are well,

Thanks,
Debbie
------------------------------------------------------------------------------
I am well aware of the work of the SEI, and I thank you for your assistance in acquiring these texts. Working for a non-profit, I unfortunately do not have the luxury of spending $400 on an individual subscription to a site I will use for a single purpose, even given the tremendous purpose it is for. Again, much thanks for your help and efforts.

-Van Anderson

-----Original Message-----
> Hi Van,
> I run a project to help get characters and scripts proposed, and get through the standards process. I'd be more than happy to answer questions you may have on the process. (I work closely with Rick and Ken, and Michael sometimes works for my project.) I am glad Rick sent you some comments. I am out of town and won't be able to review it until later this week, but I would be happy to send you my thoughts.
>
> I just tried to see if UC Berkeley has a subscription to Early Canadiana Online, but it apparently does not.
> One person who may be able to help is Chris Harvey at chris@languagegeek.com. He has done work on Canadian Aboriginal Syllabics. I'll write to him now and ask, cc'ing you. If he can't, I have another idea for a possible way to get the scans you wish, which I am following up on now.
>
> Thanks for your work on this!
>
> With best wishes,
> Debbie Anderson
> Project Leader, Script Encoding Initiative
>
> -----Original Message-----
> From: unicode-bounce@unicode.org [mailto:unicode-bounce@unicode.org] On Behalf Of vanisaac@boil.afraid.org
> Sent: Tuesday, February 10, 2009 8:35 PM
> To: unicode@unicode.org
> Subject: Draft proposal for inclusion of the Chinook script in Unicode
>
> Ever since I saw Chinook on the SMP roadmap, I've been dabbling with a proposal for including the Chinook script. The link below is what I have right now (along with about 3MB of gif images on my harddrive), and I'm looking for criticism, help, praise, and thoughts as to the wisdom of proceeding.
>
> http://www.geocities.com/vanisaac/Chinook.html
>
> PS, If anyone has subscription access to the digital library Early Canadiana Online (canadiana.org), I would appreciate your assistance in obtaining some of their archived scans of the Wawa texts in the Chinook script. The Library of Congress has been annoyingly vague about the cost and possibility of obtaining scans of texts from their rare book collection.
>
------------------------------------------------------------------------------
 wrote:
> Ever since I saw Chinook on the SMP roadmap, I've been dabbling with a
> proposal for including the Chinook script. The link below is what I
> have right now (along with about 3MB of gif images on my harddrive),
> and I'm looking for criticism, help, praise, and thoughts as to the
> wisdom of proceeding.
>
> http://www.geocities.com/vanisaac/Chinook.html
>
> PS, If anyone has subscription access to the digital library Early
> Canadiana Online (canadiana.org), I would appreciate your assistance
> in obtaining some of their archived scans of the Wawa texts in the
> Chinook script. The Library of Congress has been annoyingly vague
> about the cost and possibility of obtaining scans of texts from their
> rare book collection.

As a quick observation, you might want to avoid the heading "Normative Glyph Shapes" on one of your tables. AFAIK there is no such concept in Unicode, except perhaps for some classes of symbols.

For those who are interested, the "Chinook and Shorthand Rudiments" primer from 1898 is available on Michael Everson's site:

http://www.evertype.com/standards/iso10646/pdf/chinook-and-shorthand.pdf

--
Doug Ewell * Thornton, Colorado, USA * RFC 4645 * UTN #14
http://www.ewellic.org
http://www1.ietf.org/html.charters/ltru-charter.html
http://www.alvestrand.no/mailman/listinfo/ietf-languages
------------------------------------------------------------------------------
The greatest expert I know on the Wawa writing is Dave Robertson (ddr11@uvic.ca), who has copies of pretty much everything.
------------------------------------------------------------------------------
This is a list for discussion of the possible encoding of the Chinook Wawa in the UCS. If you would NOT like to be subscribed, please let me know.

Welcome to the Chinook@evertype.com mailing list! To post to this list, send your email to:

chinook@evertype.com

General information about the mailing list is at:

http://evertype.com/mailman/listinfo/chinook_evertype.com
------------------------------------------------------------------------------
Hello!

> “If anyone has subscription access to the digital library Early
> Canadiana Online (canadiana.org), I would appreciate your assistance
> in obtaining some of their archived scans of the Wawa texts in the
> Chinook script. The Library of Congress has been annoyingly vague
> about the cost and possibility of obtaining scans of texts from their
> rare book collection.”

Yes, I do have access to Canadiana Online. Are there any texts in particular that you need? Or I can do a search of all available documents. A variant of Duployé was also used for other languages in the area, like Thompson/Nłeʔkepmxcin. Those might come in handy as well, and might add a few more characters to the proposal.

> Hope you (and family) are well,

Thank-you, we’re doing great. Our best wishes go out to your family as well!
Chris
------------------------------------------------------------------------------
Chris, thanks so much for your help. Fortunately, I have access to the Library of Congress rare book room, where I can read actual originals of these texts, so I have been able to identify several places that employ rare constructs in the Chinook script. However, the photoduplication service seems to work about as efficiently and helpfully as everything else in Washington, DC!

Fortunately, the issues of the Kamloops Wawa that I would like to have scans of are all gathered at http://www.canadiana.org/ECO/ItemRecord/8_04645?id=58e68ecda9cb15cd and I would like to get images of the following issues or pages, in order of importance: issue No 101; No 77 pg 75; issue No 57; issue No 51; issue No 59; No 68 pg 40; issue No 50; Vol 3 No 1 pg 3; and No 45 pg 51.

Please don't be daunted by the page numbers. They only started at pg 1 at the beginning of the year, not each issue. The PNG images should be about 35KB each, and there are probably around 20 total - 5 full issues and 4 individual pages. That's actually all I need.

As for other adaptations of Duployé in use, if you can find some scans on Canadiana, or can at least steer me in the right direction to finding information, I would be indebted (as if I weren't already).

With much thanks,
- Van Anderson

-----Original Message-----
> Hello!
>
>> “If anyone has subscription access to the digital library Early
>> Canadiana Online (canadiana.org), I would appreciate your assistance
>> in obtaining some of their archived scans of the Wawa texts in the
>> Chinook script. The Library of Congress has been annoyingly vague
>> about the cost and possibility of obtaining scans of texts from their
>> rare book collection.”
>
> Yes, I do have access to Canadiana Online. Are there any texts in
> particular that you need? Or I can do a search of all available
> documents.
A variant of Duployé was also used for other languages in > the area, like Thompson/Nłeʔkepmxcin. Those might come in handy as > well, and might add a few more characters to the proposal. > >> Hope you (and family) are well, > > Thank-you, we’re doing great. Our best wishes go out to your family as > well! > > Chris ------------------------------------------------------------------------------ I thought I should post my response to Rick McGowan's initial reply, so eve ryone can catch up. -----Original Message----- > Very nice initial work on Chinook. I'm pleased to see someone doing work > on this script. It's an important script historically in the Pacific > Northwest, but it's always seemed difficult to obtain information. You're telling me. I'm just lucky to have access to the Library of Congress , or I don't think I ever would have appreciated the volume of texts, hence the need for this encoding. Without that push, I don't think I would have had the heart to put this much effort into figuring things out. Fortunately , I have until June living within biking distance of the LoC, so I hope I c an do all the necessary legwork by then. > (I am not CCing this to the Unicode list to avoid random discussion there .) I completely understand and even appreciate it. Really. > A few specific initial comments on your proposal. > > It might bear some explanation about why you chose circles in your font, > instead of the oblique ovals always (?) found in manuscripts. That is an issue I have mulled in the past, but there are two main reasons for my so called "normative" circle shapes for the circle vowels. First, all of the Kamloops Wawa texts were handwritten and copied via a mim eograph machine. I would guess that the independent letters were probably e asiest/quickest to write in that somewhat oblong form, so that's what I hav e for example forms. 
In connected text, the letter shapes varied extensivel y from ovals to teardrops to egg shaped to a compressed circle, varying con textually and just plain different from week to week. I could show several examples of each of these just from the few scanned texts I do have on hand , and access to larger archives would bring in a great corpus of letter var iation. The gist however, is that these letter shapes all skirt around an " idealized" circular form, whether something close to that form is really ev er realized or not. Bear in mind, all the texts we have are handwritten, an d if I understand the technology right, LeJeune was actually scraping wax. In the interest of full disclosure, I have to include my second reason: giv en the current typographic tools I have at my command (FontForge for Linux) and my historic tools (the more expressive High Logic Font Creator Program ), circular glyph shapes were much easier to accomplish in a nice enough fo rm that I was not embarrassed by the quality. That having been said, the im ages included with my draft proposal are not representations of the current font, which has a significantly lighter weight and a few minor variations, but maintain the pure circles of my included PNGs. I have also been endeav oring to accomplish a workable combining behavior in my font, for the use o f the Chinook Jargon community at large, so I have a second reason for the glyph shapes: the rounded characters allow me to have one form of each circ le vowel that will combine with any preceding or trailing consonant without the need of contextual variation (don't get me started on my travails codi ng for contextual orientation of I). > All of the rules look complicated, so it could probably bear more > explanation and diagrams. Especially the stuff about use of variation > selectors. It could use some examples with actual text fragments, to > determine whether it's justified and whether or not you really need two > VSes, etc. 
I would actually prefer a Combining Variation Selector over VS2 for the ove rlapping forms of letters - signifying initials of consonants, or W- vowels with O. Unfortunately, Combining Variation Selectors are not currently par t of the Unicode Standard! I defer to the experts on the wisdom of includin g with the proposal a block of these in the BMP for the use of other script s. I think they should be made available to minority languages with charact ers along the lines of Latin aesh and oethel - not strictly ligatures, but definitely combined letters of some sort. Whether they should be part of th eir own block of 16 like VS1-VS15, or just a couple characters appended to another block (I'm looking at FB08 and FB09 in Alphabetic Presentation Form s right now), I don't know. As for definite character variants, I have included a bit of explanatory ma terial on why I believe Ng, Ch/J, and Ts/Z should be considered variants of N, Sh, and S. The variant forms appear quite regularly in the texts, but a re never realized in their alternate forms in some situations. It also seem s to somewhat common to have the standard form written when the word defini tely has the variant pronunciation. While I have not found strictly contrastive use of E vs I, (there may be a perfect contrast in later years with "yet"= i+e+t and "eight"= e+i+t) a great deal of explanatory material on both divergent and convergent behavi or is found within the Chinook rudiments text. Furthermore, the word initia l distinction of E and I seems to be absolutely preserved throughout the te xts, and while I've never seen the syllable final variant in running text b efore, I've not actually ever looked for it. In short, I/E acts like a sing le character that varies in definite contexts, and is invariant in other de finite contexts. That sounds identical to the behavior of the varying Mongo lian letters. 
The Wi/We distinction is essentially by analogue to I/E, but I do have an example of the different forms being used in the same sentence , though not a scan, just a citation. On the issue of explanations and diagrams, I have considered (you've probab ly pushed me over the line here) creating a page with nothing but each of t he syllable combination rules laid out with all the examples I can find. In the end, I thought an example of each rule would at least give a glimpse a t how the script combines letters, the brevity of which I thought more impo rtant than an all-out Q.E.D justification for each rule. That having been s aid, over the next week or so, I will try to upload a good hundred words or so, and include them as examples of the different rules. Realize that I ha ve a curling competition next weekend, so I may not be that fast. > Most or all of the cross references in the names list (with arrows ->) > are not necessary. We usually only put in such references when the items > are confusable, or when someone looking for one may come up with the > other. E.g., for these script characters it's not necessary to cross > reference so heavily. I just found what I considered a graphically similar character and copied i ts cross references from my Unicode Standard version 5.0. I figured thoroug hness was preferable to brevity - especially when it's just copying text. > You needn't specify a preferred location in the SMP, it's unlikely to be > moved from where it's now roadmapped, and this suggestion would serve to > confuse some people in some committees, causing excessive "churn". Also, > the 102xx row is occupied with *ancient* scripts and the empty space is > likely to be eventually roadmapped for more of those. Chinook is > currently grouped very tentatively with other "shorthand" scripts, > though we're not really sure if any of those would ever be encoded. Ok. 
I was under the impression that the roadmaps were pretty malleable - the pre-allocation for Chinook in the roadmap to the SMP has already changed from U+11D00 in SMP 5.0 to U+16C00 in 5.0.1. My hope was that it could be fronted as effectively as possible within a roadmap block that describes the script, and there is just a nice two column space between Carian and Old Italic. I figured that I might as well put all my cards on the table if I'm going to go through with this.

On a related note, since you're a captive audience Michael, maybe you can shed some light on why there is that gap at 10200-1027F. I would have thought that Lycian and Carian would have gone there. Any insights into UTC here, or is it just an historical fluke?

> At some point, we might be able to assist with finding more scans for you, or connecting with L of C. I'm also CCing Deborah Anderson.

Sorry, Deborah, I don't know who you are, but I'm sure I'll get to know you.

> I hope the above comments give you some idea of areas to concentrate on for the next phase of your work.
>
> Rick

Thanks for all of your comments, insights, and thoughts, Rick. I really appreciate the time you have put in here, and I hope to do justice to your effort. Up to now, this has been my own private insanity, and I am grateful to have someone, or someones, with whom to share it.

-Van Anderson

PS, I am honored just to have Michael Everson Cc'd on an email for me. I fully display my geekiness here, but I absolutely consider this a celebrity encounter.

> vanisaac@boil.afraid.org wrote:
>> Ever since I saw Chinook on the SMP roadmap, I've been dabbling with a proposal for including the Chinook script. The link below is what I have right now (along with about 3MB of gif images on my harddrive), and I'm looking for criticism, help, praise, and thoughts as to the wisdom of proceeding.
>>
>> http://www.geocities.com/vanisaac/Chinook.html
>>
>> PS, If anyone has subscription access to the digital library Early Canadiana Online (canadiana.org), I would appreciate your assistance in obtaining some of their archived scans of the Wawa texts in the Chinook script. The Library of Congress has been annoyingly vague about the cost and possibility of obtaining scans of texts from their rare book collection.

------------------------------------------------------------------------------

Van said:

> I would actually prefer a Combining Variation Selector over VS2 for the overlapping forms of letters - signifying initials of consonants, or W- vowels with O. Unfortunately, Combining Variation Selectors are not currently part of the Unicode Standard!

Just so we're all on the same page here, Variation Selectors *are* combining characters in the Unicode Standard. (gc=Mn)

The constraint is that Variation Sequences can only be defined for a base character + VSx sequence. For architectural reasons related to canonical equivalence it doesn't make sense to try to use Variation Selectors to define variations in the form of *other* combining marks, so the standard disallows the definition of Variation Sequences consisting of combining mark + VSx.

The questions for Chinook Wawa (as for encoding of any script) are:

Are there instances of X's and Y's where X and Y are graphically distinct in ways that people wish to maintain systematic presentation distinctions, but where {X, Y} as a set are still conceived of as all constituting the same underlying abstract character, so that encoding X and Y as separate characters would make representation of the text content more problematical than encoding only one.
And if that is the case, can the graphical distinction between X and Y be handled solely by fonts and/or by textual markup (which will certainly be needed for other aspects of paleography, anyway), or is there a strong case to be made that *despite* the character identity, the graphical distinctions between X and Y need to be representable in plain text?

Finally, are X and Y base character forms or not?

Only when all of those questions are answered in a certain way does it make any sense to proceed with proposing Variation Sequences for a script on top of the encoding of the character repertoire itself.

That is why almost all scripts in the Unicode Standard have no Variation Sequences defined for them. Han is the major exception, for a bunch of good reasons.

Variation Sequences, IMO, should be a matter of *last* resort, and should not be the starting point for discussion of how to encode Chinook (or any other script for that matter).

--Ken

------------------------------------------------------------------------------

Ken offered some solid info & reasoning... This is why I'd like to see more details and examples of the stuff for which Van had thought to use VSes.

Rick

------------------------------------------------------------------------------

Just FYI for people on this list. Debbie Anderson will probably be out of contact until next week. I forgot I'd received a note to that effect from her on the 9th.

Rick

------------------------------------------------------------------------------

-----Original Message-----

> Van said:
>
>> I would actually prefer a Combining Variation Selector over VS2 for the overlapping forms of letters - signifying initials of consonants, or W- vowels with O. Unfortunately, Combining Variation Selectors are not currently part of the Unicode Standard!
>
> Just so we're all on the same page here, Variation Selectors *are* combining characters in the Unicode Standard. (gc=Mn)

I'm sorry, Ken.
There's some bad terminology that I will have to expunge in order to make myself clear. My preference to using VS2 in almost all circumstances here is to find some way of encoding an alternate combining behavior for Chinook letters. My instinct is to have a small set of control characters that indicate alternate combining behavior of the two adjacent characters. That is what I meant by a Combining Variation Selector. Perhaps Conjoining Selector is a less ambiguous term.

So on to the behavior of VS2 or the Conjoining Selector, or whatever:

Line Consonants + VS2 -or- Arc Consonants + VS2
VS2 encodes abbreviations and initialisms, the most common being S + T (Saghali Tyee, or God), Sh + K (Jesu Kri, or Jesus Christ), and IT + S (etc.). These initialism / abbreviation forms are partially overlapping.

And, O & W vowels + VS2
VS2 encodes compound vowel formation. The two, by far, most common compound vowels are given their own code points, Wa @ U+x1A and Wi @ U+x1B. Again, this is an overlapping behavior of logically adjacent letters in the Chinook script, this time the initial circular character completely surrounds the following character.

> The constraint is that Variation Sequences can only be defined for a base character + VSx sequence. For architectural reasons related to canonical equivalence it doesn't make sense to try to use Variation Selectors to define variations in the form of *other* combining marks, so the standard disallows the definition of Variation Sequences consisting of combining mark + VSx.

There are no definitions consisting of a combining mark + VSx. The only proposed characters are independent letters. The fact that these letters engage in complex cursive/combining behaviors does not negate their status as independent letters.
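For readers following the terminology, the "base character + VSx" constraint quoted above can be sketched in a few lines of Python using only the standard unicodedata module. This is an illustration under stated assumptions, not anything from the proposal: the Latin letter "A" merely stands in for a Chinook letter (the script has no code points in this discussion), and has_valid_shape is a hypothetical helper that checks only the shape of a sequence, not whether it is registered in the standard.

```python
import unicodedata

VS1 = "\uFE00"  # VARIATION SELECTOR-1
VS2 = "\uFE01"  # VARIATION SELECTOR-2


def is_variation_selector(ch: str) -> bool:
    """True for the sixteen selectors U+FE00..U+FE0F."""
    return 0xFE00 <= ord(ch) <= 0xFE0F


def has_valid_shape(seq: str) -> bool:
    """Hypothetical check: exactly one non-mark base followed by one VS.

    It does not consult the list of variation sequences actually
    defined by the standard; it only rejects "combining mark + VSx".
    """
    if len(seq) != 2 or not is_variation_selector(seq[1]):
        return False
    # General categories Mn, Mc, Me all start with "M" (marks).
    return not unicodedata.category(seq[0]).startswith("M")


# The selectors themselves are nonspacing marks (gc=Mn).
print(unicodedata.name(VS1), unicodedata.category(VS1))

print(has_valid_shape("A" + VS2))       # stand-in letter + VS2
print(has_valid_shape("\u0301" + VS1))  # combining acute + VS1
```

Under this reading, a sequence of independent letter + VS2 passes the shape test, which is the point made in the reply above; whether such sequences are *justified* is the separate question the rest of the thread takes up.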
> The questions for Chinook Wawa (as for encoding of any script) are:
>
> Are there instances of X's and Y's where X and Y are graphically distinct in ways that people wish to maintain systematic presentation distinctions, but where {X, Y} as a set are still conceived of as all constituting the same underlying abstract character, so that encoding X and Y as separate characters would make representation of the text content more problematical than encoding only one.
>
> And if that is the case, can the graphical distinction between X and Y be handled solely by fonts and/or by textual markup (which will certainly be needed for other aspects of paleography, anyway), or is there a strong case to be made that *despite* the character identity, the graphical distinctions between X and Y need to be representable in plain text?
>
> Finally, are X and Y base character forms or not?
>
> Only when all of those questions are answered in a certain way does it make any sense to proceed with proposing Variation Sequences for a script on top of the encoding of the character repertoire itself.

So if I read this right, we seem to have 4 distinct conditions necessary for a Variation Selector:

1) X and Y are graphically distinct and need to remain distinct.
2) X and Y are conceived of as the same abstract character.
3) X/Y distinctions should be indicated in raw text, rather than by markup.
4) I don't know what you mean by "base character forms" so I can't really answer this intelligibly.

To answer these questions (as best I can) for the uses of VS1 on Arc Consonants and I:

The forms of Ng, Ch/J, and Ts/Z in Chinook:
1) are graphically distinct, each having a single dot contained within the curve of the letters N, Sh, and S.
2) are used as if they are the same abstract character, there being no modern writers to conceive one way or another - the variant forms do not uniformly appear where they should, indicating equivalence in the composer's mind, and never appear when used for abbreviations and initialisms.
3) have different forms representing phonetic distinctions from their "base" character. These phonemes (hence forms) do not have allophonic distribution in the language, and the alternate forms cannot be predicted by an algorithm short of a spell-checker (is that redundant?). The text may be legible to those knowledgeable of the language, but it is not complete. My feeling is that it is akin to writing Icelandic with "P" and "p" because "thorn" looks so similar.
4) This is my best guess as to the meaning of condition 4. In normal, flowing text, the forms Ng, Ch/J, and Ts/Z all behave exactly as an Arc Consonant. In initials / abbreviations, they take a form identical to N, Sh, and S in that context: as an unaltered N, Sh, or S letterform.

The form of E in Chinook:
1) is graphically distinct, being rotated or mirrored from the normal orientation of the base character, "I". They are distinct in isolated context and most initial and final contexts.
2) is not even listed in the Chinook Rudiments text as a separate character from "I".
3) has a different orientation, representing phonetic distinction from "I". The phonemes (and hence forms) do not have allophonic distribution, and the alternate forms cannot be determined by algorithm. The text probably remains fairly legible in most circumstances, but it seems to be enough of a concern that the script provides recourse for the loss of the variation in medial and other non-distinctive circumstances; through the use of modifying (diacritical?) dots.
4) Both forms, I and E, act identically in all text. They combine with the same letters and have identical forms in many positions in a syllable.
They do, however, maintain distinct forms when possible. The "I" form seems to be the standard citation form for the letter, however.

Currently, I do not have enough actual text available to answer these questions in regards to "We".

> That is why almost all scripts in the Unicode Standard have no Variation Sequences defined for them. Han is the major exception, for a bunch of good reasons.

I see Chinook as being more like Mongolian.

> Variation Sequences, IMO, should be a matter of *last* resort, and should not be the starting point for discussion of how to encode Chinook (or any other script for that matter).

I actually agree with you wholeheartedly on that point. It is unfortunately getting late, and I don't want to send drivel, so I will end with the promise that I will revisit this issue over the weekend.

> --Ken

-Van

------------------------------------------------------------------------------

I am on the same page with Ken that variation selectors should be used only as a last resort, where the glyphs belong to the same abstract character but the graphical distinction needs to be maintained in plain text.

Even if there is evidence that the dotted forms are used here and the undotted forms are used there, to represent the same sound, to me that does not make a case for variation selectors. Perhaps the patterns of use of U+05C1 HEBREW POINT SHIN DOT within Hebrew text might provide some insight. Or this could simply be an orthographic convention, not expected to be captured in a script encoding.
--
Doug Ewell * Thornton, Colorado, USA * RFC 4645 * UTN #14
http://www.ewellic.org
http://www1.ietf.org/html.charters/ltru-charter.html
http://www.alvestrand.no/mailman/listinfo/ietf-languages

------------------------------------------------------------------------------

Hi Van,

I had remembered that someone received a Dictionary Society of North America award for work on Chinook lexicography and I discovered it was Dave Robertson, whom Bill Poser recommended below. If you have difficulty reaching Dave, let me know. I know a few people at UVic (including two staff members in Humanities Computing with a keen interest in Unicode and who could perhaps assist with scanning materials [unsure, actually, if they can, but I throw this out as a possibility]).

Debbie

Link to short article about Dave Robertson and meeting on Chinook held in 2005:
http://web.uvic.ca/gradstudies/greats/pdf/2005/robertson.pdf

------------------------------------------------------------------------------

> I am on the same page with Ken that variation selectors should be used only as a last resort, where the glyphs belong to the same abstract character but the graphical distinction needs to be maintained in plain text.

I am also wary of the gratuitous use of the Variation Selectors. I think they are an all too easy means of wantonly encoding stylistic, rather than semantic information in text. That said, I very early on thought that variation selectors were an approach to the several problems these characters represented, without causing undue harm elsewhere. Early on, I actually encoded Chinook Variation Selectors (on the model of Mongolian Free Variation Selectors) because I didn't have version 5.0 and didn't know about universal variation selectors. Once I found them and used them for every problem I encountered, I started looking at reducing their use. The use of algorithmic syllable composition and ZWJ and ZWNJ to override the algorithm is a direct result of this process.
My desire to find some - any - way of encoding the overlapping forms of letters is also part of this process.

So I think we have two questions at issue: do the alternate forms constitute the same abstract character, and should the distinction be maintained in plain text. I am unprepared to discuss Wi/We without further delving into the texts, and I cannot access the texts until I get some scans, or Tuesday morning at the Library. That said, the other uses of VS1 are fair game here.

I do not think the question is whether Ng, Ch/J, Ts/Z, and E should be encoded in plain text. The fact that the secondary forms have a distinct phonetic value, differing from the standard pronunciation of the character; that where possible, it is generally written in the secondary form, not as the base character; and that the alternate phoneme cannot be determined from context, seem to me sound reasoning for it to be encodable in plain text. The alternate forms have semantic value to the point that one of them - E, has an alternate means of identification when its form is convergent with the base character, and the distinction must be maintained; and the others differ only in a context where the reader's knowledge of the language must be so mature, that they can supply not just the missing information for the one character, but for all but one letter of an entire word!

On to whether they should constitute a separate character. They clearly merge graphically with N, Sh, S, and I in situations where the secondary forms are not permitted - either conflicting with an overlapping character, or in an orientation unable to exit into the cursively following character. To encode them as separate characters would require implementers to divine that they have a contextual form identical to another character. I cannot say which is more desirable from a standards standpoint.
I think (emphasis on that word) that Variation Selectors provide a simple, elegant method of maintaining the distinct forms, while representing their convergent relationship.

That having been said, I am not one of the people responsible for the long-term viability of the Unicode Standard, and I do not know the history of, battles over, or politics concerning the Variation Selectors. More importantly, if we agree that these letters should be represented in plain text, is there another means of encoding them? I didn't submit this draft to have everyone tell me how right I was, I sent it out so people would tell me where the problems were, and to help me resolve them. Obviously, there is a lot of skepticism concerning the use of Variation Selectors, and I guess I am most confused about what the concerns are with using them, and whether this idea for their use is opening up a can of worms for the keepers of the standard.

> Even if there is evidence that the dotted forms are used here and the undotted forms are used there, to represent the same sound, to me that does not make a case for variation selectors.

What would make the case then? I hate the term "make the case", because I'm not trying to convince you that this is the right way, I am presenting it as the way I figured out to tackle the problems of a baffling script behavior, and have it work consistently for most of the complexities of text handling in the script. I'm not trying to prescribe a solution, I'm trying to describe a problem. If that solution is going to cause more problems, it needs to be replaced.

> Perhaps the patterns of use of U+05C1 HEBREW POINT SHIN DOT within Hebrew text might provide some insight.

So do you think the solution to the affricate/velar forms Ch/J, Ts/Z, and Ng is to encode a Combining Middle Dot in the Chinook block?

> Or this could simply be an orthographic convention, not expected to be captured in a script encoding.
Correct me if I'm wrong, but wouldn't "orthographic convention" imply that the different forms absolutely do not convey phonetic value, that this is a mere stylistic distinction? When I think of orthographic convention, I think ligatures and strange shapes in Byzantine Greek manuscripts. Is there a nuance I'm not understanding, am I completely mistaken, or is that basically right?

> --
> Doug Ewell * Thornton, Colorado, USA * RFC 4645 * UTN #14
> http://www.ewellic.org
> http://www1.ietf.org/html.charters/ltru-charter.html
> http://www.alvestrand.no/mailman/listinfo/ietf-languages

I will write up version 3.2 this weekend and see how to incorporate some of these suggestions. I think I've been won over by the Combining Middle Dot idea, and one of my more vacuous ideas - the use of H to modify vowels - will be replaced with the use of standard combining characters. I will also take out any suggestions on the proposed location for the allocation.

-Van

------------------------------------------------------------------------------

wrote:

> I do not think the question is whether Ng, Ch/J, Ts/Z, and E should be encoded in plain text. The fact that the secondary forms have a distinct phonetic value, differing from the standard pronunciation of the character; that where possible, it is generally written in the secondary form, not as the base character; and that the alternate phoneme cannot be determined from context, seem to me sound reasoning for it to be encodable in plain text.

I agree. These are clearly different letters in the script and need to be representable in plain text. The question is how to translate "letters in the script" to "characters in the encoding."

> On to whether they should constitute a separate character.
They > clearly merge graphically with N, Sh, S, and I in situations where the > secondary forms are not permitted - either conflicting with an > overlapping character, or in an orientation unable to exit into the > cursively following character. This was what I meant by an "orthographic convention," though that may not have been the best choice of terms. Glyph A is normally used, but in certain contexts having to do with ease of reading or writing, closely-related glyph B is used instead. I suspect all shorthand systems have such issues; I'm pretty sure Gregg does. > To encode them as separate characters would require implementers to > divine that they have a contextual form identical to another > character. I don't understand why that would be required. That doesn't mean it isn't, only that I don't understand. > I cannot say which is more desirable from a standards standpoint. I > think (emphasis on that word) that Variation Selectors provide a > simple, elegant method of maintaining the distinct forms, while > representing their convergent relationship. It should not be necessary to use VS to write ordinary text in a script. > That having been said, I am not one of the people responsible for the > long-term viability of the Unicode Standard, and I do not know the > history of, battles over, or politics concerning the Variation > Selectors. More importantly, if we agree that these letters should be > represented in plain text, is there another means of encoding them? I > didn't submit this draft to have everyone tell me how right I was, I > sent it out so people would tell me where the problems were, and to > help me resolve them. Obviously, there is a lot of skepticism > concerning the use of Variation Selectors, and I guess I am most > confused about what the concerns are with using them, and whether this > idea for their use is opening up a can of worms for the keepers of the > standard. 
Variation selectors are hints to the rendering engine to render the character differently from normal. They are metadata, and the Unicode Standard generally frowns on inserting this kind of metadata into plain text. The encoding model I am thinking of is something like SOFT HYPHEN, a character which sometimes appears as a visible glyph and sometimes not, depending on contextual layout. The example might not be good because so few rendering engines actually implement SOFT HYPHEN, but this is the model. Or, simply encode a modifier dot for the consonants and a 90° rotated "I" character for the "E", and directly encode the one that context demands. Or encode everything as precomposed letters. It's not a requirement that every glyph form in the Unicode/10646 proposal has to appear in the historical printed charts, which of course were not designed with digital character encoding needs in mind. Ask Michael Everson about this; many of his proposals over the past 10 to 15 years have included separately encoded variant forms of the same letter. > What would make the case then? I hate the term "make the case", > because I'm not trying to convince you that this is the right way, I > am presenting it as the way I figured out to tackle the problems of a > baffling script behavior, and have it work consistently for most of > the complexities of text handling in the script. I'm not trying to > prescribe a solution, I'm trying to describe a problem. If that > solution is going to cause more problems, it needs to be replaced. I agree that the existence of variant glyph forms, and the situations under which one is chosen over the other, may be baffling. Such is the nature of shorthand systems. I don't know whether this is a problem that needs to be solved in this way in the encoding. I'm looking forward to hearing more from Ken and other Unicode decision-makers on the relative merits of VS versus other solutions. 
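Two of the alternatives above (a combining modifier dot versus precomposed letters) already coexist for the Hebrew shin dot raised earlier in the thread, so it can serve as a concrete point of comparison. The Python sketch below is only an illustration with Hebrew characters, using the standard unicodedata module; nothing in it is a proposed Chinook encoding, and the describe helper is a name made up for this example.

```python
import unicodedata

SHIN = "\u05E9"         # HEBREW LETTER SHIN (base letter)
SHIN_DOT = "\u05C1"     # HEBREW POINT SHIN DOT (combining mark)
PRECOMPOSED = "\uFB2A"  # HEBREW LETTER SHIN WITH SHIN DOT


def describe(ch: str) -> str:
    """Format one character as code point, name, and general category."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)} gc={unicodedata.category(ch)}"


for ch in (SHIN, SHIN_DOT, PRECOMPOSED):
    print(describe(ch))

# The precomposed letter is canonically equivalent to base + dot,
# so NFD turns it into the two-code-point sequence.
decomposed = unicodedata.normalize("NFD", PRECOMPOSED)
print([f"U+{ord(c):04X}" for c in decomposed])
```

The dot here is an ordinary nonspacing mark applied to a base letter, with no variation selector involved, which is the kind of model being suggested for the dotted consonants.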
> So do you think the solution to the affricate/velar forms Ch/J, Ts/Z, > and Ng is to encode a Combining Middle Dot in the Chinook block? Maybe. Not necessarily a "middle dot" as such (Ken can tell you how many of those we already have) but at least a modifier dot. Maybe U+00B7 could be used directly. >> Or this could simply be an orthographic convention, not expected to >> be captured in a script encoding. > > Correct me if I'm wrong, but wouldn't "orthographic convention" imply > that the different forms absolutely do not convey phonetic value, that > this is a mere stylistic distinction? When I think of orthographic > convention, I think ligatures and strange shapes in Byzantine Greek > manuscripts. Is there a nuance I'm not understanding, am I completely > mistaken, or is that basically right? See above. By "orthographic convention" I specifically meant using one glyph to write the spelled-out version of "Jesu" and a different glyph to write the "J" as an unvoiced "Sh" in the abbreviated form. > I will write up version 3.2 this weekend and see how to incorporate > some of these suggestions. I think I've been won over by the Combining > Middle Dot idea, and one of my more vacuous ideas - the use of H to > modify vowels - will be replaced with the use of standard combining > characters. I will also take out any suggestions on the proposed > location for the allocation. Looking forward to seeing it. -- Doug Ewell * Thornton, Colorado, USA * RFC 4645 * UTN #14 http://www.ewellic.org http://www1.ietf.org/html.charters/ltru-charter.html http://www.alvestrand.no/mailman/listinfo/ietf-languages ˆ ------------------------------------------------------------------------------ Hi Van, I've meant to take a closer look at your proposal since you sent a note last week. I've been busy and now sick, but here's a brief response. I didn't realize Canadiana.org charged access fees! 
But you ought to be able to ILL microfiches of some of Kamloops Wawa and allied publications by Le Jeune. To track down everything you'd have to make a number of field trips to specialized repositories.

There's not only that material but several hundred texts I've discovered written by First Nations people in this "shorthand" script. (Which by the way they called _Chinuk pipa_ "Chinook writing".) I'm analyzing the script and these people's variety of Chinook Jargon [CJ] in my forthcoming dissertation.

The great quantity of this material is why I estimate that a majority of CJ is recorded in this script. Virtually none of it has previously been analyzed in the literature. What's more, the First Nations-written texts are the only extensive corpus of CJ in actual use--as opposed to the dozens of prescriptive-leaning and often mutually plagiarizing CJ publications in Roman script. This is an underdocumented language, despite common wisdom e.g. among scholars of pidgins that this is one of the better-documented pidgins. (Which also is true.)

Moreover, quite a lot of the earliest and most extensive materials in 8 Salish languages, some written by native speakers, exists in the form of this shorthand. These too have gone unanalyzed by linguists, largely I believe due to the writing system and its non-portability.

All this helps show why encoding _Chinuk pipa_ is a desirable goal, and why I thank Van for his great efforts.

There's a good deal of interest in the CJ shorthand in particular and in CJ in general, but sharing of these texts, discussion of them in scholarly research, and creation of learning materials have been forced into the unsatisfactory workaround of nonce transliterations.

I have certain specific reactions as well. A brief list off the top of my cold-addled head:

*In practice, not all (only about 29) of the shorthand "letters" which Le Jeune presents in charts of his alphabet were actually used.
**Nasal-vowel characters had no use in writing CJ or Salish languages; they're vanishingly rare in Le Jeune's own handwriting and I think I may have seen a total of one occurrence in a First Nations-written text. **The supposed distinction between and isn't consistently borne out in actual texts -- essentially never in First Nations texts. **Very few of the diphthongal, triphthongal etc. letters based on the round letters and are actually used. The only common ones are , , . Pretty rare is (I think Van has it as , following Le Jeune's romanization), which occurs in just one word, a toponym , i.e. Quaaout Indian Reserve in BC, Canada. However this reserve is mentioned a fair amount in the preserved texts, so that a character is needed. *In various implementations of the shorthand, diacritics which Le Jeune never overtly explained or mentioned anywhere were used. Any attempt at encoding every _Chinuk pipa_ character would have to add acute accent marks (for Okanagan Salish), macrons (for "Stalo" Salish), breves, and a character to approximate the voiceless rounded velar fricative /xW/. A thankless task! *The de facto standard which partially resolves some of you guys' questions was Le Jeune's handwriting as seen in Kamloops Wawa: **In practice most people most of the time, at a rate far exceeding chance, oriented the vowels in a given CJ word precisely as Le Jeune habitually did. (See next point as well.) **In practice most people most of the time segmented shorthand words into syllables. In fact, there was remarkable unanimity in assigning the syllable breaks. As a result I've thought that the sanest solution to encoding _Chinuk pipa_ might be to create syllable glyphs. (For CJ, with its small vocabulary, I guess this might mean 100-200 glyphs.) Sigh: this would I'm sure only reinforce a longstanding misconception that the shorthand was a syllabary. Enough for now, I must rest. With high hopes that this encoding will be successful, --Dave Robertson U. 
of Victoria

------------------------------------------------------------------------------

> Hi Van, I've meant to take a closer look at your proposal since you sent a note last week. I've been busy and now sick, but here's a brief response.

I'm sorry you've not been well. I was going to email you directly if you hadn't responded today.

> I didn't realize Canadiana.org charged access fees! But you ought to be able to ILL microfiches of some of Kamloops Wawa and allied publications by Le Jeune. To track down everything you'd have to make a number of field trips to specialized repositories.

Actually, Deborah Anderson and UC Berkeley's Script Encoding Initiative has gotten me in touch with someone who has a canadiana subscription. I, of course, have downloaded the scans of all the Wawa texts available for free, including a hymnal, a Chinook Rudiments text, etc. but I am looking forward to having home access.

> There's not only that material but several hundred texts I've discovered written by First Nations people in this "shorthand" script. (Which by the way they called _Chinuk pipa_ "Chinook writing".) I'm analyzing the script and these people's variety of Chinook Jargon [CJ] in my forthcoming dissertation.

I guess the script name is probably important. I pretty much adopted the name from the roadmap (Dave, see http://unicode.org/roadmaps/smp/ ) but never really gave it any thought.

> The great quantity of this material is why I estimate that a majority of CJ is recorded in this script. Virtually none of it has previously been analyzed in the literature. What's more, the First Nations-written texts are the only extensive corpus of CJ in actual use--as opposed to the dozens of prescriptive-leaning and often mutually plagiarizing CJ publications in Roman script. This is an underdocumented language, despite common wisdom e.g. among scholars of pidgins that this is one of the better-documented pidgins. (Which also is true.)
That was what motivated me as well. I saw three decades worth of the Kamloops Wawa at the Library of Congress, and basically told myself that something had to be done.

> Moreover, quite a lot of the earliest and most extensive materials in 8 Salish languages, some written by native speakers, exists in the form of this shorthand. These too have gone unanalyzed by linguists, largely I believe due to the writing system and its non-portability.

This is where I REALLY need your help, Dave. I need to know if there is anything odd going on in these languages' use of the Wawa writing. Do they have any non-standard characters? Do they have dots signifying glottalized consonants? My only formal native language training is in Nez Perce, so are there any weird sounds in the Salish languages not in Sahaptians and Chinook that were symbolized ad hoc?

> All this helps show why encoding _Chinuk pipa_ is a desirable goal, and why I thank Van for his great efforts.

That means a lot, coming from you.

> There's a good deal of interest in the CJ shorthand in particular and in CJ in general, but sharing of these texts, discussion of them in scholarly research, and creation of learning materials have been forced into the unsatisfactory workaround of nonce transliterations.
>
> I have certain specific reactions as well. A brief list off the top of my cold-addled head:
>
> *In practice, not all (only about 29) of the shorthand "letters" which Le Jeune presents in charts of his alphabet were actually used.
>
> **Nasal-vowel characters had no use in writing CJ or Salish languages; they're vanishingly rare in Le Jeune's own handwriting and I think I may have seen a total of one occurrence in a First Nations-written text.

LeJeune seems to have only used them for English loan words, and I don't remember seeing them in any of the actual texts, only in the Chinook Rudiments lexicon.
However, they do exist, and they definitely do not constitute presentation forms of other characters, so they've gotta be included.

> **The supposed distinction between and isn't consistently
> borne out in actual texts -- essentially never in First Nations
> texts.

Which is why I'm having such trouble figuring out the proper encoding.

> **Very few of the diphthongal, triphthongal etc. letters based on the
> round letters and are actually used. The only common ones
> are , , . Pretty rare is (I think Van has it as
> , following Le Jeune's romanization), which occurs in just one
> word, a toponym , i.e. Quaaout Indian Reserve in BC, Canada.
> However this reserve is mentioned a fair amount in the preserved
> texts, so that a character is needed.

My instinct is to treat these as composed forms. The problem I have is that by doing so, I am creating a Unicode conflict because Wa and Wi are encoded as separate characters, but they are theoretically canonically decomposable under NFD and NFKD, and these characters are not included as legacy encodings. That having been said, Wa is actually the second member of the composable compound vowel "OhWa", and Wi is the initial member of the composable compound vowels "Weyi" and "Wia". Not including them precomposed presents us with our only instances of triply compounded composed vowels in the Chinook Script. Admittedly, I know that none of these compound composable vowels are common, but that's kind of the point. We include the common characters and provide representation for the rare.

> *In various implementations of the shorthand, diacritics which Le Jeune
> never overtly explained or mentioned anywhere were used. Any attempt at
> encoding every _Chinuk pipa_ character would have to add acute accent
> marks (for Okanagan Salish), macrons (for "Stalo" Salish), breves, and a
> character to approximate the voiceless rounded velar fricative /xW/. A
> thankless task!
Two things: first, I have already made arrangements to use standard combining diacritics for the I/E dots, and documenting the use of combining acute, macron, and breve will take me about two minutes once I see how they operate. Second, I absolutely NEED to know the form of xW, so I can include it in the repertoire. > *The de facto standard which partially resolves some of you guys' > questions was Le Jeune's handwriting as seen in Kamloops Wawa: > > **In practice most people most of the time, at a rate far exceeding > chance, oriented the vowels in a given CJ word precisely as Le Jeune > habitually did. (See next point as well.) > > **In practice most people most of the time segmented shorthand words > into syllables. In fact, there was remarkable unanimity in assigning > the syllable breaks. As a result I've thought that the sanest > solution to encoding _Chinuk pipa_ might be to create syllable > glyphs. (For CJ, with its small vocabulary, I guess this might mean > 100-200 glyphs.) Sigh: this would I'm sure only reinforce a > longstanding misconception that the shorthand was a syllabary. I think there are three problems with encoding the syllables directly. First, it is not only unnecessary, given that syllables are joined combinations of letters, it wastes a large amount of allocation space for a script that is quite rare. Second, with I and sometimes U not forming independent syllables, the number soon spirals out of control and leaves absolutely no room for non-standard or future anomalous syllabic breaking. Third, it sets a horrible precedent for any future shorthand-derived writing systems that have similar syllabic breaks. > Enough for now, I must rest. With high hopes that this encoding will be > successful, I'm coming down with a bug, too. I'm glad to have your input and oversight, Dave, and I hope you get better. -Van > --Dave Robertson > U. of Victoria > > David D. 
Robertson
> PhD candidate
> Department of Linguistics
> University of Victoria
> PO Box 3045
> Victoria, BC V8W 3P4 Canada
> (250) 472-4579 phone
> (250) 472-5300 fax

Van Anderson
Construction Superintendent
DC Habitat for Humanity
Washington, DC
vanisaac@hotmail.com
vanisaac@boil.afraid.org
202.680.3513
------------------------------------------------------------------------------
Hi again Van, a couple even briefer responses:

[Dave]
>> ...quite a lot of the earliest and most extensive materials in 8
>> Salish languages, some written by native speakers, exists in the form of
>> this shorthand. These too have gone unanalyzed by linguists...

[Van]
> This is where I REALLY need your help, Dave. I need to know if there is
> anything odd going on in these languages' use of the Wawa writing. Do they
> have any non-standard characters? Do they have dots signifying glottalized
> consonants? My only formal native language training is in Nez Perce, so are
> there any weird sounds in the Salish languages not in Sahaptian and Chinook
> that were symbolized ad hoc?

[Dave responds]
Though there are some fun sounds in Interior Salish that're not in CJ or neighbouring languages, e.g. pharyngeals, there are no unique _Chinuk pipa_ characters for them. (Nonce adaptations are used instead, e.g. or .) Similarly there are nothing but hints that some writers may rarely have tried to represent glottalized resonants, e.g. by a digraph . (Note that glottalization, like stress and pitch, is a suprasegmental which is generally unrepresented in writing, esp. by L1 speakers. There's not even a shorthand character for glottal stop, and precious little evidence of writers bothering to invent one.) I wonder if some of Le Jeune's 1924 "Chinook Rudiments" letters were an attempt to represent such sounds, but no record of their use exists.
[Dave]
>> *In various implementations of the shorthand, diacritics which Le Jeune
>> never overtly explained or mentioned anywhere were used. Any attempt at
>> encoding every _Chinuk pipa_ character would have to add acute accent
>> marks (for Okanagan Salish), macrons (for "Stalo" Salish), breves, and a
>> character to approximate the voiceless rounded velar fricative /xW/. A
>> thankless task!

[Van]
> Two things: first, I have already made arrangements to use standard
> combining diacritics for the I/E dots, and documenting the use of combining
> acute, macron, and breve will take me about two minutes once I see how they
> operate. Second, I absolutely NEED to know the form of xW, so I can include
> it in the repertoire.

[Dave responds]
Examples of the acutes, macrons and breves will be easy to provide in the form of scans.

is effectively the letter (Le Jeune romanizes it as ) + the combining-dot diacritic. Since that diacritic is identical to shorthand , this is an intuitive solution: would sound similar to /ixW/, and in fact is most often used to represent that sequence.

[Dave]
>> **In practice most people most of the time segmented shorthand words
>> into syllables. In fact, there was remarkable unanimity in assigning
>> the syllable breaks. As a result I've thought that the sanest
>> solution to encoding _Chinuk pipa_ might be to create syllable
>> glyphs...

[Van]
> I think there are three problems with encoding the syllables directly.
> First, it is not only unnecessary, given that syllables are joined
> combinations of letters, it wastes a large amount of allocation space for a
> script that is quite rare. Second, with I and sometimes U not forming
> independent syllables, the number soon spirals out of control and leaves
> absolutely no room for non-standard or future anomalous syllabic breaking.
> Third, it sets a horrible precedent for any future shorthand-derived
> writing systems that have similar syllabic breaks.

[Dave responds]
Do what works best! The ideal solution is an actual alphabetic encoding, since it was an explicit decision by _Chinuk pipa's_ creators that it be able to represent syllables of any complexity. (Unlike the Carrier-style syllabics, which were originally experimented with; it rapidly became clear that those were unwieldy for consonantal-cluster-ridden non-Athabaskan languages like Salish and CJ.) What appears to have been a huge selling point in making _Chinuk pipa_ as popular as it became was that you could write your own name, and the names written came from Salish, English and French--thus calling for quite a flexible writing system.
------------------------------------------------------------------------------
> Hi again Van, a couple even briefer responses:
>
> Though there are some fun sounds in Interior Salish that're not in CJ or
> neighbouring languages, e.g. pharyngeals, there are no unique _Chinuk
> pipa_ characters for them. (Nonce adaptations are used instead, e.g.
> or .)

I definitely need to see these, preferably with glosses. For those looking in from the Unicode side, the word "pipa" comes from English "paper" and means a letter, a book, writing, or paper.

> Similarly there are nothing but hints that some writers may
> rarely have tried to represent glottalized resonants, e.g. by a digraph
> .

Like a backwards Ng?

> (Note that glottalization, like stress and pitch, is a
> suprasegmental which is generally unrepresented in writing, esp. by L1
> speakers. There's not even a shorthand character for glottal stop, and
> precious little evidence of writers bothering to invent one.) I wonder if
> some of Le Jeune's 1924 "Chinook Rudiments" letters were an attempt to
> represent such sounds, but no record of their use exists.
The only thing I have seen that seems to indicate such is that for words like the Gibbs/Thomas Ko (arrive), Kow (tie), and Kull (hard), in the Chinook Rudiments, LeJeune seems to have a Pipa "hK" (K with leading H-dot) rather than his defined "Kh" (K with trailing H-dot) and he inconsistently glosses them as either *Kho, Khow, Khell; or K'o, K'ow, and K'ell. I have even seen what looks like a crossed "K" when writing Ko/Kho/K'o. (Kamloops Wawa No 101, pg 165, line 15) > Examples of the acutes, macrons and breves will be easy to provide in the > form of scans. Please email me a copy, so I can include the behavior in the proposal. I really just need an example or two with each vowel and diacritic (if the O/W vowels are the same, just a few examples will do for all) > is effectively the letter (Le Jeune romanizes it > as ) + the combining-dot diacritic. Since that diacritic is identical > to shorthand , this is an intuitive solution: would sound similar > to /ixW/, and in fact is most often used to represent that sequence. like this? http://www.geocities.com/vanisaac/pics/Uh.gif or more like this? http://www.geocities.com/vanisaac/pics/UH.gif > Do what works best! The ideal solution is an actual alphabetic encoding, > since it was an explicit decision by _Chinuk pipa's_ creators that it be > able to represent syllables of any complexity. (Unlike the Carrier-style > syllabics, which were originally experimented with; it rapidly became > clear that those were unwieldy for consonantal-cluster-ridden > non-Athabaskan languages like Salish and CJ.) What appears to have been a > huge selling point in making _Chinuk pipa_ as popular as it became was > that you could write your own name, and the names written came from > Salish, English and French--thus calling for quite a flexible writing > system. 
Having studied Japanese, and knowing the complexity of using a purely C+V syllabic system to try to write English loan words, I can appreciate just how wholly inadequate the Unified Canadian Aboriginal Syllabics were for Chinook Wawa.

Thanks Dave,
-Van
------------------------------------------------------------------------------
[Dave]
>> a digraph .

[Van]
> Like a backwards Ng?

[Dave responds]
No, because the -dot is not situated inside the curve, but before it and spatially separate.

[Dave]
>> (Note that glottalization, like stress and pitch, is a
>> suprasegmental which is generally unrepresented in writing, esp. by L1
>> speakers...I wonder if
>> some of Le Jeune's 1924 "Chinook Rudiments" letters were an attempt to
>> represent such sounds, but no record of their use exists.

[Van]
> The only thing I have seen that seems to indicate such is that for words
> like the Gibbs/Thomas Ko (arrive), Kow (tie), and Kull (hard), in the
> Chinook Rudiments, LeJeune seems to have a Pipa "hK" (K with leading
> H-dot) rather than his defined "Kh" (K with trailing H-dot) and he
> inconsistently glosses them as either *Kho, Khow, Khell; or K'o, K'ow, and
> K'ell. I have even seen what looks like a crossed "K" when writing
> Ko/Kho/K'o. (Kamloops Wawa No 101, pg 165, line 15)

[Dave responds]
You're astute and I'm glad, because with a bad cold I worded the above poorly (shoulda written "crosslinguistically unrepresented in writing" and "Le Jeune's 1924...letters/characters with diacritics").

I also forgot (dang it) some characters that are consistently used by Le Jeune and often by First Nations writers:

is the form with a dot or more often a "tick" (tiny line) touching it from one side. usually represents an ejective velar (with or without labialization) as /k'aw/ "to tie", sometimes an ejective uvular as in /q'(W)u7/ ((W)=automatic labialization before rounded vowel, 7=glottal stop) "to arrive" and /q'@l/ (@=schwa) "hard".

is the form crossed by a perpendicular tick.
represents a plain uvular (with or without labialization). (Note: ejective uvulars but not ejective velars were also represented by the digraph .) is an explicitly non-/h/ fricative, usually voiceless, either velar or uvular. In form it's (excuse me) a tiny lightning bolt just like a Nazi "s" in the SS logo. In behaviour it's a non-connector like . Examples of the use of this character (always in etymologically First Nations words only) are /ixt/ 'one' and as one spelling for /qaX/ (X=uvular fricative) 'where?' I don't recall if you've already dealt with , i.e. voiceless lateral fricative. It's the form accompanied by a side tick. It was in frequent enough use, though the same sound was very commonly represented by digraphs as in (.=syllable boundary to show that the character was used, not ) /LaXawyam/ 'hello' or as in /paL/ 'full'. [Dave] >> Examples of the acutes, macrons and breves will be easy to provide in >> the >> form of scans. [Van] > Please email me a copy, so I can include the behavior in the proposal. I > really just need an example or two with each vowel and diacritic (if the > O/W vowels are the same, just a few examples will do for all) [Dave responds] Can do. What's your time window? I'm sorry to say how ridiculously busy I am... [Dave] >> is effectively the letter (Le Jeune romanizes it >> as ) + the combining-dot diacritic. Since that diacritic is >> identical >> to shorthand , this is an intuitive solution: would sound >> similar >> to /ixW/, and in fact is most often used to represent that >> sequence. [Van] > like this? http://www.geocities.com/vanisaac/pics/Uh.gif or more like > this? http://www.geocities.com/vanisaac/pics/UH.gif [Dave responds] The first one is dead on. ------------------------------------------------------------------------------ > No, because the -dot is not situated inside the curve, but before > it and spatially separate. Well, that makes it very simple. 
As far as the Unicode Standard is concerned, it is simply an orthographic convention, like "ch", "th", "sh", etc. in English.

> [Dave]
>>> (Note that glottalization, like stress and pitch, is a
>>> suprasegmental which is generally unrepresented in writing, esp. by L1
>>> speakers...I wonder if
>>> some of Le Jeune's 1924 "Chinook Rudiments" letters were an attempt to
>>> represent such sounds, but no record of their use exists.
> [Van]
>> The only thing I have seen that seems to indicate such is that for words
>> like the Gibbs/Thomas Ko (arrive), Kow (tie), and Kull (hard), in the
>> Chinook Rudiments, LeJeune seems to have a Pipa "hK" (K with leading
>> H-dot) rather than his defined "Kh" (K with trailing H-dot) and he
>> inconsistently glosses them as either *Kho, Khow, Khell; or K'o, K'ow, and
>> K'ell. I have even seen what looks like a crossed "K" when writing
>> Ko/Kho/K'o. (Kamloops Wawa No 101, pg 165, line 15)
> [Dave responds]
> You're astute and I'm glad, because with a bad cold I worded the above
> poorly (shoulda written "crosslinguistically unrepresented in writing" and
> "Le Jeune's 1924...letters/characters with diacritics").

I do have a bit of a linguistics background. I was a Classics major and filled a lot of my upper division requirement with linguistics/history of language coursework. I won't put this proposal to bed until it's right, both from an encoding standpoint and from a linguistic standpoint. An incomplete proposal is really worse than none at all. With an incomplete proposal, anyone wanting to fix it has to do so within the rules I already laid out, so I don't want to screw this up!

> I also forgot (dang it) some characters that are consistently used by Le
> Jeune and often by First Nations writers:
>
> is the form with a dot or more often a "tick" (tiny line)
> touching it from one side.
> usually represents an ejective velar
> (with or without labialization) as /k'aw/ "to tie", sometimes an
> ejective uvular as in /q'(W)u7/ ((W)=automatic labialization before
> rounded vowel, 7=glottal stop) "to arrive" and /q'@l/ (@=schwa)
> "hard".

I have this encoded as a combined 'hK' - K with a left side dot.

> is the form crossed by a perpendicular tick. represents a
> plain uvular (with or without labialization). (Note: ejective uvulars but
> not ejective velars were also represented by the digraph .)

So there are two side ticks (left and right), and a through tick that are all distinct! That's going to require some thought and some digging through the texts.

> is an explicitly non-/h/ fricative, usually voiceless, either velar or
> uvular. In form it's (excuse me) a tiny lightning bolt just like a Nazi
> "s" in the SS logo. In behaviour it's a non-connector like . Examples
> of the use of this character (always in etymologically First Nations words
> only) are /ixt/ 'one' and as one spelling for /qaX/ (X=uvular
> fricative) 'where?'

I will make sure to include it in the character repertoire, and once I see the character in text, I will make a picture for the online proposal document and include the glyph in my outline font.

> I don't recall if you've already dealt with , i.e. voiceless lateral
> fricative. It's the form accompanied by a side tick. It was in
> frequent enough use, though the same sound was very commonly represented
> by digraphs as in (.=syllable boundary to show that the
> character was used, not ) /LaXawyam/ 'hello' or as in
> /paL/ 'full'.

Yes, I have already made arrangements for the representation of both 'hL' (left side tick) and lateralized L as 'Lh' (right side tick). LeJeune included these on page 5 of the Rudiments text, along with KR, etc.

> [Dave]
>>> Examples of the acutes, macrons and breves will be easy to provide in
>>> the form of scans.
> [Van]
>> Please email me a copy, so I can include the behavior in the proposal. I
>> really just need an example or two with each vowel and diacritic (if the
>> O/W vowels are the same, just a few examples will do for all)
> [Dave responds]
> Can do. What's your time window? I'm sorry to say how ridiculously busy
> I am...

To answer that well, I would need to know what timelines are like for the UTC. Do we think this proposal can be in good enough shape by their next meeting, and when is that meeting? Basically, if you have a scan of one document with diacritics - and that writer uses them like pretty much everyone else does - I can probably put together a good working draft on all the behaviours (gotta remember to use my Canadian spellings), and get it in the next version of the draft proposal.

> [Van]
>> like this? http://www.geocities.com/vanisaac/pics/Uh.gif or more like
>> this? http://www.geocities.com/vanisaac/pics/UH.gif
> [Dave responds]
> The first one is dead on.

Good. I already had it taken care of.

-Van
------------------------------------------------------------------------------
The next version of the draft proposal is online at the same address as before: http://www.geocities.com/vanisaac/Chinook.html

Don't worry, a link at the top will take you to the previous version, I haven't deleted it.

Changes from version 3.1 to version 3.2 are:
1) Expansion of the standard combining behaviour of H to include modifying H-dots with line consonants.
2) New Chinook Combining Middle Dot for use with N, Sh, S, and U instead of VS1.
3) Code point allocated for Salish X, glyph image to be added later.
4) Proper Combining Diacritic Marks for vowel pronunciation.
5) VS1 usage reduced to I/E distinctions (see questions below)
6) Conjoining behaviours now encoded with currently undefined "conjoining behaviour selector" instead of VS2.
7) Displaced rendering now default for nasal vowels, with syllable breaks encoding in-line placement.
8) Syllable forming rules now placed at end of text, rules refined, and multiple examples given for most rules.
X) Still in the process of including elucidating examples of Initial, Medial, and Final forms of circle vowels.

Questions to figure out before final proposal:
1) How should I encode the overlapping abbreviation behaviour of consonants and the overlapping-combining behaviour of circle vowels? i.e., what is the Conjoining Selector?
2) Should Wa and Wi be composed characters or retain separate code points?
3) Should E remain a variant of I, be given its own code point, or should the I/E distinction not be encoded in plain text?
4) What exactly is the behaviour of combining acute, macron, and breve when the Chinook script is used to write Salish languages?
------------------------------------------------------------------------------
Dear Sir,

Since you have prepared this proposal, could you please let me know how I could prepare a similar proposal for, say, Linear A? In particular, I am interested in the bureaucratic part of the application process (e.g., how it starts, to whom it should be addressed, etc.).

Sincerely,
Apostolos Syropoulos

--
Apostolos Syropoulos
366, 28th October Str.
GR-671 00 Xanthi, GREECE
Web-page at http://obelix.ee.duth.gr/~apostolo
Blogs at http://asyropoulos.wordpress.com/
http://hypercomputation.blogspot.com/
------------------------------------------------------------------------------
Minor points... In the latest updated proposal you wrote:

> the archives of the /Kamloops Wawa/, written in the Chinook script,
> includes a considerable dictionary

However, later you write:

> Ordering of the characters in the Chinook script is undefined, so
> allocation order in the Chinook Character Block is revisable up to
> inclusion in the standard.

Does the above-cited dictionary not provide an ordering? Or at least, the semblance of one? Or do you mean that dictionary is in Latin order?
When I view the chart labeled "Character Sequences: Combining Diacritical Marks on Vowels" there is one broken image in row 5, "Tep". Rick ------------------------------------------------------------------------------ > Minor points... In the latest updated proposal you wrote: >> the archives of the /Kamloops Wawa/, written in the Chinook script, >> includes a considerable dictionary > > However, later you write: > >> Ordering of the characters in the Chinook script is undefined, so >> allocation order in the Chinook Character Block is revisable up to >> inclusion in the standard. > > Does the above-cited dictionary not provide an ordering? Or at least, > the semblance of one? Or do you mean that dictionary is in Latin order? Sorry for the confusion, but it is important to clear these things up. Yes, you are correct. I say so in the section on alphabetization, but I have revised the above to say "Ordering of the characters in the Chinook script is undefined - the only Chinook script lexicon cites in Latin alphabetical order - so allocation order in the Chinook Character Block is revisable up to inclusion in the standard." > When I view the chart labeled "Character Sequences: Combining > Diacritical Marks on Vowels" there is one broken image in row 5, "Tep". Hmmm, it comes out just fine for me. I can't see anything wrong with the code. I will re-upload the image and see if that changes anything. Was the image missing, or did it seem jumbled up somehow? > Rick -Van PS, I think I may have figured out the solution to the I/E problem and the conjoining behaviours for vowels. I need to do some experimenting to make sure all compound vowel forms can be realized, but it will leave us with only three outstanding issues: encoding overlapping consonants, the shape of /x/, and behaviour of combining acute, breve, and macron. ------------------------------------------------------------------------------ I got a missing image icon. 
It's no longer reproducible, so you must have fixed it, or it was an anomaly. Rick > >> When I view the chart labeled "Character Sequences: Combining >> Diacritical Marks on Vowels" there is one broken image in row 5, "Tep". >> > > Hmmm, it comes out just fine for me. I can't see anything wrong with the code. I will re-upload the image and see if that changes anything. Was the image missing, or did it seem jumbled up somehow? > ------------------------------------------------------------------------------ > So there are two side ticks (left and right), and a through tick that are > a > ll distinct! That's going to require some thought and some digging through > the texts. I disagree. The side tick, whether on or on , in practice seems to freely vary between the left and right sides of the or form. But yes, the through tick is distinct from the side tick. ------------------------------------------------------------------------------ There's no established alphabetical order for _Chinuk pipa_'s characters. Le Jeune 1924 is arranged thematically rather than alphabetically. In some presentations of the writing system, e.g. in Kamloops Wawa, Le Jeune made approximate use of the English/French alphabetical order. That's as close as he came to formalizing an order. Incidentally, _Chinuk pipa_ letters didn't have conventionalized names either. Not in the sense that "A" in English is /ei/ or "Z" in Canadian English is /zed/. But Le Jeune once or twice referred to the shorthand letters (in Chinook Jargon) as "ii (/ei/), bi, si, di" etc. The problems with this are that he didn't name all the shorthand letters (what might he have called ?), and that many of these names have no intuitive correspondence with the actual broadly-phonetic use of the characters (e.g. "ii" = shorthand , "si" = shorthand ). --Dave R. > Minor points... 
> In the latest updated proposal you wrote:
>> the archives of the /Kamloops Wawa/, written in the Chinook script,
>> includes a considerable dictionary
>
> However, later you write:
>
>> Ordering of the characters in the Chinook script is undefined, so
>> allocation order in the Chinook Character Block is revisable up to
>> inclusion in the standard.
>
> Does the above-cited dictionary not provide an ordering? Or at least,
> the semblance of one? Or do you mean that dictionary is in Latin order?
>
> When I view the chart labeled "Character Sequences: Combining
> Diacritical Marks on Vowels" there is one broken image in row 5, "Tep".
>
> Rick
>
> ---------
> vanisaac@boil.afraid.org wrote:
>> The second draft of the preliminary proposal to encode the Chinook
>> script (Chinuk Pipa) in the Unicode Standard can be found online at
>>
>> http://www.geocities.com/vanisaac/Chinook.html
>>
>
------------------------------------------------------------------------------
>> So there are two side ticks (left and right), and a through tick that are
>> all distinct! That's going to require some thought and some digging through
>> the texts.
>
> I disagree. The side tick, whether on or on , in practice seems
> to freely vary between the left and right sides of the or form.
>
> But yes, the through tick is distinct from the side tick.

Shoot, I thought I was really close to getting this one. Ok, I'll try to head to the library on Monday and see if I can take a good look at some of the later volumes of KW texts and see if I can get the texts to coordinate with the Chinook Rudiments.

-Van
------------------------------------------------------------------------------
I believe I have also got the vowel problem figured out. The costs are no more differentiation of "wey" and "wi", "wa" and "wi" are no longer precomposed, and separate code points are allocated for "I" and "E".
The benefits are absolutely no variation selectors used with vowels, only minimal theoretical need of ZWJ and ZWNJ, and all forms of compound vowels, except "wey", are composable, as well as normalized.

To achieve this, Chinook vowels have a yes/no value for two properties: can they be the base of a composed character? and can they combine into a composed character? The following table lists the characters and their values:

O  -- +base, +combining
A  -- -base, +combining
Oo -- -base, +combining
Ow -- +base, +combining
E  -- -base, +combining
I  -- -base, -combining
U  -- -base, -combining

"I", however, does join cursively with 'O', 'A', ('E'?) and 'I'. "U" does not join or combine with any vowels.

The following table lists all compound vowels, Y+vowels, and vowel+I sequences:

Wo   = O + O
Wa   = O + A
Woo  = O + Oo
Wow  = O + Ow
Wi   = O + E
Oi   = O + I
OhWa = O + O + A
OwAh = Ow + A
Weyi = O + E + E
Wia  = O + E + A
Ai   = A + I
Wai  = O + A + I
Ya   = I + A
Yo   = I + O
Yoo  = I + Oo (usually just "U")
Yow  = I + Ow
Ye   = I + I (or I + E)
Ei   = E + I

So we are now left with the one remaining need for some sort of selector, that is to represent conjoined consonant abbreviations. I'll keep thinking and searching for the best implementation on that one, but it looks like I have to do some text diving for a while. I'll see if I can get the vowel revision up over this next weekend, and see where we are next week.

Thanks for everyone's help, thoughts, criticism, encouragement, and expertise - especially you, Dave. I'll be back in touch by Monday.

Van Anderson
------------------------------------------------------------------------------
Van Anderson wrote,

> So we are now left with the one remaining need for some sort of selector,
> that is to represent conjoined consonant abbreviations. I'll keep thinking
> and searching for the best implementation on that one, ...
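
[Editorial sketch] The two yes/no vowel properties in Van's message above can be tried out in a few lines of Python. This is only a rough illustration, not part of the proposal: the names `VowelProps`, `VOWELS`, `COMPOUNDS`, and `is_composable` are hypothetical, and the rule it checks (first vowel must be +base, every later vowel must be +combining) is an assumed reading of the table, since the message does not spell the rule out. The property values and compound sequences are copied from the message; sequences involving "I" are left out of `COMPOUNDS` because Van describes "I" as joining cursively rather than combining.

```python
from typing import Dict, List, NamedTuple


class VowelProps(NamedTuple):
    base: bool       # may serve as the base of a composed vowel
    combining: bool  # may combine into a composed vowel


# Property values copied from the table in the message above.
VOWELS: Dict[str, VowelProps] = {
    "O": VowelProps(base=True, combining=True),
    "A": VowelProps(base=False, combining=True),
    "Oo": VowelProps(base=False, combining=True),
    "Ow": VowelProps(base=True, combining=True),
    "E": VowelProps(base=False, combining=True),
    "I": VowelProps(base=False, combining=False),
    "U": VowelProps(base=False, combining=False),
}

# Compound vowels from the same message that contain no "I" member.
COMPOUNDS: Dict[str, List[str]] = {
    "Wo": ["O", "O"],
    "Wa": ["O", "A"],
    "Woo": ["O", "Oo"],
    "Wow": ["O", "Ow"],
    "Wi": ["O", "E"],
    "OhWa": ["O", "O", "A"],
    "OwAh": ["Ow", "A"],
    "Weyi": ["O", "E", "E"],
    "Wia": ["O", "E", "A"],
}


def is_composable(sequence: List[str]) -> bool:
    """Assumed rule: the first vowel is +base and every later vowel is +combining."""
    if len(sequence) < 2:
        return False
    first, rest = sequence[0], sequence[1:]
    return VOWELS[first].base and all(VOWELS[v].combining for v in rest)


if __name__ == "__main__":
    for name, parts in COMPOUNDS.items():
        joined = " + ".join(parts)
        print(name, "=", joined, "->", is_composable(parts))
    # "Oi" = O + I is listed as a vowel+I sequence; under the assumed rule
    # it is a cursive join rather than a composition.
    print("Oi = O + I ->", is_composable(["O", "I"]))
```

Under this reading, every compound in `COMPOUNDS` passes the check, while the Y+vowel and vowel+I sequences (Ya, Oi, Ai, and so on) do not, which matches the note that "I" only joins cursively.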
Quoting from the proposal, ( http://www.geocities.com/vanisaac/Chinook.html ) “Most letters have variant forms, including the addition of ancillary dots, compounding of vowels, and overlapping combining behaviors for initialisms and abbreviations.” (I am not “up-to-speed” on this writing system.) If the three abbreviations shown in the proposal are the only abbreviations used, would it be desirable to simply propose them as atomic characters? If there are many possible abbreviations (open-ended), then perhaps a new character could be added to the proposal. (CHINOOK ABBREVIATION INDICATOR?) Best regards, James Kass ------------------------------------------------------------------------------ 2009/2/19 Thank you very much for your detailed and very informative response! This was something I was looking for a long time. It is my understanding that this is a time consuming process, but one that I plan to undertake in the near future unless something else comes up. Kind regards, Apostolos PS Obviously, you have visited Greece;-) -- Apostolos Syropoulos 366, 28th October Str. GR-671 00 Xanthi, GREECE Web-page at http://obelix.ee.duth.gr/~apostolo Blogs at http://asyropoulos.wordpress.com/ http://hypercomputation.blogspot.com/ ------------------------------------------------------------------------------ I’ve just sent you the pdfs you were looking for from Kamloops Wawa. Please let me know if you received them as your e-mail client might not have liked the attachments. Chris ------------------------------------------------------------------------------ Here are the scans you requested. I’ve also included two scans from “Canadian Savage Folk” John MacLean (1896). While certainly a very closed-minded book, it does have a couple of examples of Duployé shorthand. Note that La Jeune was also the compiler of the Thompson shorthand text. ------------------------------------------------------------------------------ -----Original Message----- > Here are the scans you requested. 
> I’ve also included two scans from
> “Canadian Savage Folk” John MacLean (1896). While certainly a very
> closed-minded book, it does have a couple of examples of Duployé
> shorthand. Note that La Jeune was also the compiler of the Thompson
> shorthand text.

Chris, thank you so much for all this great material. Fortunately, my email server is run by one of my best friends (we're actually driving Washington, DC -> Halifax -> Vancouver this summer), and he would never reduce the utility of my email by blocking attachments.

Deborah, thank you for getting me in touch with Chris, and my best to you and the SEI. This should have us pretty much set for documentation, and I am indebted to you both for that.

I don't know if you've been watching the Chinook list or not, but I am also in contact with Dave Robertson of the University of Victoria. Not only is he pretty much the authority on the Wawa writing, he is getting me examples of Chinook script usage for the Thompson, Lilooet, Shuswap, and other Salishan languages, so I can include all of the necessary characters, variants, and diacritic behaviours with the proposal as well.

Again, my thanks to both of you,

Van Isaac Anderson
------------------------------------------------------------------------------
On 2009/02/19 4:54 p.m., vanisaac@boil.afraid.org wrote:
> I don't know if you've been watching the Chinook list or not, but I am also
> in contact with Dave Robertson of the University of Victoria. Not only is
> he pretty much the authority on the Wawa writing, he is getting me examples
> of Chinook script usage for the Thompson, Lilooet, Shuswap, and other Salishan
> languages, so I can include all of the necessary characters, variants,
> and diacritic behaviours with the proposal as well.

I wouldn’t mind being CC-ed on the examples of the usage for Thompson, Lilooet, etc. Which begs the question, should this in fact be called the Chinook script?
Chris
------------------------------------------------------------------------------
-----Original Message-----
> On 2009/02/19 4:54 p.m., vanisaac@boil.afraid.org wrote:
>> I don't know if you've been watching the Chinook list or not, but I am also
>> in contact with Dave Robertson of the University of Victoria. Not only is
>> he pretty much the authority on the Wawa writing, he is getting me examples
>> of Chinook script usage for the Thompson, Lilooet, Shuswap, and other Salishan
>> languages, so I can include all of the necessary characters, variants,
>> and diacritic behaviours with the proposal as well.
>
> I wouldn’t mind being CC-ed on the examples of the usage for Thompson,
> Lilooet, etc. Which begs the question, should this in fact be called
> the Chinook script?
>
> Chris

The writers using the script had two names for the script, either "Wawa writing" or "Chinook Pipa", meaning "Chinook writing" in the Chinook Jargon. It did not matter if they were using it to write Chinook Jargon, English, Shushwap, Thompson, or whatever.

But the question does, in fact, remain as to what the script should be called. This is not a matter that has escaped my attention. I have honestly had far greater structural concerns to iron out in this script, but the script name question will not be ignored before the final proposal is submitted.

If you sign up for the Chinook list discussion at http://evertype.com/mailman/listinfo/chinook_evertype.com you can get the Salish texts at the same time I do.

-Van
------------------------------------------------------------------------------
I was sick today, and in between sleeping and coughing, I managed to put together version 3.3 of the Chinook proposal. Again, you can access the proposal at

http://www.geocities.com/vanisaac/Chinook.html

Differences between versions 3.2 and 3.3 are:
1) Elimination of precomposed compound vowels "Wa" and "Wi".
2) Vowel properties for the formation of compound vowels. No control characters!
3) Addition of CHINOOK LETTER E, with compounding and joining properties distinct from "I".
4) Preliminary glyph image for U+x1C, CHINOOK LETTER X. - Dave, let me know.
5) Addition of small material on usage with Salishan languages.
6) VS1 currently retired from the proposal.
7) Addition of compound vowel character sequences.

Questions still to be addressed:
1) What is the best means of encoding the Conjoining Selector?
2) How do combining macron, breve, and acute work with the Chinook script?
3) Should the script be called "Chinook", "Chinuk Pipa", "Duployean", or ...?

Again, version 3.3 of the preliminary Chinook proposal can be found at

http://www.geocities.com/vanisaac/Chinook.html

-Van
------------------------------------------------------------------------------
There are more abbreviations in actual and even common use. For example (out of habit from discussing these in my diss., I capitalize the abbreviations):

"North Bend"

"Kamloops"

"Salmon Arm"

<$K> (where $=the hushing fricative "sh" character) "Shhkaltkmah", the modern Sahhaltkum (sp?) reserve.

Those last two bring up the observation that not all abbreviations are constructed by the same mechanism. The general mechanism is to overlay the symbols over one another--"crossing" each other.

Yet <$K> for "Shhkaltkmah" is written with the first letter segueing cursively directly into the second. The reason: to distinguish this abbreviation from the certainly earlier-vintage <$K> "Jesus Christ". (Which dates back to the original French-language Duploye shorthand.)

And I recall as being written cursively. Like the <$K> village name, I should point out, this still stands out as an abbreviation due to its lack of vowels. No words in Chinook Jargon were vowelless (though quite a few in Interior Salish languages were).

Another wrinkle in writing abbreviations in _Chinuk pipa_ is that they're not always made of 2 letters.
For one thing, you'll find the pretty common for "North Thompson" (a village whose mission was St. John/Jean the Baptist, thus the initials). Here, it's not the case that all 3 letters overlay one another. Instead, the and <$>, being mirror images, touch at the er 'peak' of their curves, with the perpendicular stroke of the bisecting both.

And there's for "et cetera", pretty frequently used. This one begins with the cursive sequence , which is then crossed by the .

My impression is that the cursively written abbrs. can be perfectly handled by the normal rules of letter-sequencing of _Chinuk pipa_.* But the non-cursive "crossed-over" abbrs. may well require separate characters.

*A question I've always wanted to ask of anyone who takes on the task of encoding _Chinuk pipa_: will you genuinely be able to show on a computer screen any possible sequence of characters? Some words of Chinook Jargon have a decidedly downward trajectory in the shorthand, so I've wondered whether this would play havoc with e.g. line spacing.

--Dave R

PS: I reckon there are even more abbreviations attested. I just don't recall all of them right now. Should be able to check my research materials for a fuller list.

>
> Van Anderson wrote,
>
>> So we are now left with the one remaining need for some sort of
>> selector, that is to represent conjoined consonant abbreviations. I'll keep
>> thinking and searching for the best implementation on that one, ...
>
> Quoting from the proposal,
> ( http://www.geocities.com/vanisaac/Chinook.html )
>
> “Most letters have variant forms, including the addition
> of ancillary dots, compounding of vowels, and overlapping
> combining behaviors for initialisms and abbreviations.”
>
> (I am not “up-to-speed” on this writing system.)
>
> If the three abbreviations shown in the proposal are the only
> abbreviations used, would it be desirable to simply propose
> them as atomic characters?
>
> If there are many possible abbreviations (open-ended), then
> perhaps a new character could be added to the proposal.
> (CHINOOK ABBREVIATION INDICATOR?)
>
> Best regards,
>
> James Kass
>
------------------------------------------------------------------------------
> 3) Should the script be called "Chinook", "Chinuk Pipa", "Duployean", or

I definitely agitate for _Chinuk pipa_.

No other name for it has enjoyed the kind of recognition that this did. And this is the name for the writing system in the language (Chinook Jargon) most characteristically associated with it. That is to say, while 8 Salish languages and even bits of Latin, English, French, Greek, Cree, and "Montagnais" Athabaskan were also written in this system, the vast majority of material written in it was CJ.

"Duployan" is unsuitable because it is not distinct from the original French-language Duploye shorthand with its different vowel and consonant inventory. _Chinuk pipa_ is a modification of that system and would not be entirely legible to a user of the original (and popular) French system.

"Chinook" or even the English-language calque of _Chinuk pipa_, "Chinook writing", have never enjoyed currency as names for this writing system.

Some folks have referred to it by still other names including "the shorthand", "the Wawa writing", even inaccurately "syllabics". None of these has broad currency and some are very inaccurate.

--Dave R
------------------------------------------------------------------------------
> There are more abbreviations in actual and even common use. For example
> (out of habit from discussing these in my diss., I capitalize the
> abbreviations):
>
> "North Bend"
>
> "Kamloops"
>
> "Salmon Arm"
>
> <$K> (where $=the hushing fricative "sh" character) "Shhkaltkmah", the
> modern Sahhaltkum (sp?) reserve.

I have also seen what looks like a or perhaps in No 1, the supplemental "Prayers in Shushwap". It appears to be the Shushwap language version of .
Also there appears to be an abbreviation.

> Those last two bring up the observation that not all abbreviations are
> constructed by the same mechanism. The general mechanism is to overlay
> the symbols over one another--"crossing" each other.
>
> Yet <$K> for "Shhkaltkmah" is written with the first letter segueing
> cursively directly into the second. The reason: to distinguish this
> abbreviation from the certainly earlier-vintage <$K> "Jesus Christ".
> (Which dates back to the original French-language Duploye shorthand.)

And because we have a contrastive use of <$K> and <$.K>, it means I can't punt the issue of encoding abbreviations. I need to figure out how to encode the overlapping behaviour because it is linguistically significant.

> And I recall as being written cursively. Like the <$K> village name,
> I should point out, this still stands out as an abbreviation due to its
> lack of vowels. No words in Chinook Jargon were vowelless (though quite a
> few in Interior Salish languages were).
>
> Another wrinkle in writing abbreviations in _Chinuk pipa_ is that they're
> not always made of 2 letters. For one thing, you'll find the pretty
> common for "North Thompson" (a village whose mission was St.
> John/Jean the Baptist, thus the initials). Here, it's not the case that
> all 3 letters overlay one another. Instead, the and <$>, being mirror
> images, touch at the er 'peak' of their curves, with the perpendicular
> stroke of the bisecting both.

All these are details that an implementer would want to know, and they should be included with the documentation, but from an encoding point of view, they are already handled.

> And there's for "et cetera", pretty frequently used. This one
> begins with the cursive sequence , which is then crossed by the .
>
> My impression is that the cursively written abbrs. can be perfectly
> handled by the normal rules of letter-sequencing of _Chinuk pipa_.* But
> the non-cursive "crossed-over" abbrs.
> may well require separate
> characters.

They will require an encoding mechanism, no more. Distinct characters would only be justified if their alternate behaviour constituted a new "letter".

> *A question I've always wanted to ask of anyone who takes on the task of
> encoding _Chinuk pipa_: will you genuinely be able to show on a computer
> screen any possible sequence of characters? Some words of Chinook Jargon
> have a decidedly downward trajectory in the shorthand, so I've wondered
> whether this would play havoc with e.g. line spacing.

That's up to the implementers. Unicode is about encoding linguistically significant plain text data, no more. My own personal voyage into Chinook typography has brought me to the realization that Chinook text should have a lot of leading (as in the metal) so adjacent lines do not conflict.

> --Dave R
>
> PS: I reckon there are even more abbreviations attested. I just don't
> recall all of them right now. Should be able to check my research
> materials for a fuller list.

I think the only example I need for the proposal is the <$K> for "Shhkaltkmah".

Dave, I had a question for you about a text. I am looking at one of LeJeune's texts titled "Prayers in Thompson", and have found this syllable: http://www.geocities.com/vanisaac/pics/Sa-box.gif which looks like plus a box. Can you help me out with this one? Also, I have my provisional /X/ created http://www.geocities.com/vanisaac/pics/X.png Is that close?

-Van
------------------------------------------------------------------------------
Hi!

One of the tables has the heading "Normative Glyph Shapes". Since the glyph shapes are not normative (in the standardese sense), this heading is misleading. Maybe you meant nominal glyph shapes, and that would in general be fine, especially when there is a lot of shaping expected as in this case.
But the table is a character chart, and those glyphs are referred to as chart glyphs (in particular the combining character glyph is specially made for character charts).

Kind regards
/Kent Karlsson

On 2009-02-20 05.40, "vanisaac@boil.afraid.org" wrote:
> The latest version (3.3) of the preliminary/exploratory Chinook script
> proposal can be found at
>
> http://www.geocities.com/vanisaac/Chinook.html
>
>
> Please do NOT respond to this list with comments. Please sign up with the
> public Chinook in the UCS mail list at
> http://evertype.com/mailman/listinfo/chinook_evertype.com to join the
> conversation.
>
>
> Again, version 3.3 of the preliminary Chinook proposal can be found at
>
> http://www.geocities.com/vanisaac/Chinook.html
>
>
> -Van Anderson
>
------------------------------------------------------------------------------
I think those who are interested have already joined the discussion list.

On 20 Feb 2009, at 04:40, vanisaac@boil.afraid.org wrote:
> The latest version (3.3) of the preliminary/exploratory Chinook
> script proposal can be found at
>
> http://www.geocities.com/vanisaac/Chinook.html
>
>
> Please do NOT respond to this list with comments. Please sign up
> with the public Chinook in the UCS mail list at
> http://evertype.com/mailman/listinfo/chinook_evertype.com to join
> the conversation.
>
>
> Again, version 3.3 of the preliminary Chinook proposal can be found at
>
> http://www.geocities.com/vanisaac/Chinook.html
>
>
> -Van Anderson

Michael Everson * http://www.evertype.com
------------------------------------------------------------------------------
> I have also seen what looks like a or perhaps in No 1, the
> supplemental "Prayers in Shushwap". It appears to be the Shushwap language
> version of . Also there appears to be an abbreviation.

You're right about both. is Shuswap , 'God'. is short for Latin 'Sanctus Spiritus'.

> Dave, I had a question for you about a text.
> I am looking at one of LeJeune's texts titled "Prayers in Thompson",
> and have found this syllable:
> http://www.geocities.com/vanisaac/pics/Sa-box.gif
> which looks like plus a box. Can you help me out with this one?

Looks like to me. Can you tell me what word you found this in? I vaguely recall that this is a use of a doubled vowel to represent a sequence like /a9/ (where /9/ is a voiced pharyngeal approximant).

> Also, I have my provisional /X/ created
> http://www.geocities.com/vanisaac/pics/X.png Is that close?

That's close. It just needs to be tiny. I can't tell what size it is. :)
------------------------------------------------------------------------------