Language support Digraphs and trigraphs




1 language support

1.1 algol
1.2 pascal
1.3 vim
1.4 gnu screen
1.5 j
1.6 c
1.7 c++

1.7.1 removal of trigraphs


1.8 rpl





language support

Different systems define different sets of digraphs and trigraphs, as described below.


algol

Early versions of ALGOL predated the standardised ASCII and EBCDIC character sets, and were typically implemented using a manufacturer-specific six-bit character code. A number of ALGOL operations either lacked codepoints in the available character set or were not supported by peripherals, leading to a number of substitutions, including := for ← (assignment) and >= for ≥ (greater than or equal).


pascal

The Pascal programming language supports the digraphs (., .), (* and *) for [, ], { and } respectively. Unlike the other cases mentioned here, (* and *) were, and still are, in wide use.


vim

The Vim text editor supports digraphs for the actual entry of text characters, following RFC 1345.


gnu screen

GNU Screen has a digraph command, bound to ^A ^V by default.


j

The J programming language is a descendant of APL that uses the ASCII character set rather than the traditional APL symbols. To handle the fact that the printable range of ASCII is smaller than APL's specialised set of symbols, the . (dot) and : (colon) characters are used to inflect ASCII symbols, so that unigraphs, digraphs or trigraphs are interpreted as standalone "symbols".


Unlike the use of digraphs and trigraphs in C and C++, there are no single-character equivalents to these in J.


c

The C preprocessor replaces all occurrences of the following nine trigraph sequences by their single-character equivalents before any other processing.


Trigraph  Equivalent
??=       #
??/       \
??'       ^
??(       [
??)       ]
??!       |
??<       {
??>       }
??-       ~
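

The replacement pass can be sketched in C as follows. This is only an illustration under stated assumptions, not how any particular compiler implements it: the names trigraph_char and replace_trigraphs are hypothetical, the input is assumed to be a NUL-terminated string, and the caller is assumed to supply an output buffer at least as large as the input.

```c
#include <stddef.h>

/* Map the third character of a candidate sequence to its replacement,
   or return 0 if the sequence is not one of the nine trigraphs. */
static char trigraph_char(char third)
{
    switch (third) {
    case '=':  return '#';
    case '/':  return '\\';
    case '\'': return '^';
    case '(':  return '[';
    case ')':  return ']';
    case '!':  return '|';
    case '<':  return '{';
    case '>':  return '}';
    case '-':  return '~';
    default:   return 0;
    }
}

/* Hypothetical helper: copy src to dst, replacing trigraphs while
   scanning left to right. dst needs room for strlen(src) + 1 bytes,
   since the output is never longer than the input.
   Returns the length of the output. */
size_t replace_trigraphs(const char *src, char *dst)
{
    size_t n = 0;
    while (*src != '\0') {
        char repl = 0;
        /* src[2] is only read when src[0] and src[1] are both '?',
           so the read never goes past the terminating NUL. */
        if (src[0] == '?' && src[1] == '?')
            repl = trigraph_char(src[2]);
        if (repl != 0) {
            dst[n++] = repl;   /* three characters collapse into one */
            src += 3;
        } else {
            dst[n++] = *src++; /* ordinary character, copied as is */
        }
    }
    dst[n] = '\0';
    return n;
}
```

Under these assumptions, an input of three question marks followed by - comes out as ?~, because the first question mark does not begin a trigraph and the scan simply moves on to the next position.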



A programmer may want to place two question marks together yet not have the compiler treat them as introducing a trigraph. The C grammar does not permit two consecutive ? tokens, so the only places in a C file where two question marks in a row may be used are multi-character constants, string literals, and comments. This is particularly a problem for the classic Mac OS, where the constant '????' may be used as a file type or creator. To safely place two consecutive question marks within a string literal, the programmer can use string concatenation "...?""?..." or an escape sequence "...?\?...".
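

Both workarounds can be seen side by side in a small C sketch. The variable and function names here are illustrative only; the point is that neither spelling ever puts two raw question marks next to each other in the source text, yet both produce the same string.

```c
#include <string.h>

/* Adjacent string literals are joined by the compiler after trigraph
   replacement has already happened, so the two question marks only
   meet once it is too late for them to form a trigraph. */
static const char by_concatenation[] = "What?" "?!";

/* The \? escape is a standard way to spell a literal question mark. */
static const char by_escape[] = "What?\?!";

/* Returns 1 if both spellings yield the same seven-character string. */
int spellings_agree(void)
{
    return strlen(by_escape) == 7
        && strcmp(by_concatenation, by_escape) == 0;
}
```

Written without either workaround, the same literal would end in a trigraph and, on a compiler with trigraphs enabled, would be read as What| instead.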


??? is not itself a trigraph sequence, but when followed by a character such as - it is interpreted as ? + ??-, as in the example below, which has 16 ?s before the /.


The ??/ trigraph can be used to introduce an escaped newline for line splicing; this must be taken into account for correct and efficient handling of trigraphs within the preprocessor. It can also cause surprises, particularly within comments. For example:



// Will the next line be executed????????????????/
a++;

which is a single logical comment line (as used in C++ and C99), and



/??/
* comment *??/
/

which is a correctly formed block comment.
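

Why both examples behave this way can be sketched with a small C function that mimics only the relevant part of the first two translation phases. It is a simplified model, not a real preprocessor: the name splice_lines is hypothetical, only the backslash trigraph is handled, and the output buffer is assumed to be at least as large as the input.

```c
#include <stddef.h>

/* Hypothetical helper modelling two steps for the backslash only:
   first the backslash trigraph is turned into a backslash, then a
   backslash directly followed by a newline is deleted together with
   that newline (line splicing). Returns the length of the output. */
size_t splice_lines(const char *src, char *dst)
{
    size_t n = 0;
    while (*src != '\0') {
        char c = *src;
        size_t used = 1;

        /* Step 1: recognise the one trigraph that matters here. */
        if (src[0] == '?' && src[1] == '?' && src[2] == '/') {
            c = '\\';
            used = 3;
        }

        /* Step 2: a backslash at the end of a line joins that line
           to the next one, so neither character is copied. */
        if (c == '\\' && src[used] == '\n') {
            src += used + 1;
            continue;
        }

        dst[n++] = c;
        src += used;
    }
    dst[n] = '\0';
    return n;
}
```

Run over the three-line block comment example, this model produces the single line /* A comment */, and run over the line comment example it leaves no newline before a++; so the statement becomes part of the comment.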



In 1994, a normative amendment to the C standard, included in C99, supplied digraphs as more readable alternatives to five of the trigraphs. They are <: for [, :> for ], <% for {, %> for } and %: for #.


Unlike trigraphs, digraphs are handled during tokenization, and any digraph must always represent a full token by itself, or compose the token %:%: replacing the preprocessor concatenation token ##. If a digraph sequence occurs inside another token, for example a quoted string or a character constant, it is not replaced.
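

As a rough illustration, the following C fragment is written with digraphs throughout. The macro and function names are made up for the example, and it assumes a compiler running in a mode that includes the 1994 amendment (C99 or later should do).

```c
#include <string.h>

/* The two-character spelling of the hash sign starts a directive. */
%:define LEN 3

/* The four-character spelling is one token: the paste operator. */
%:define PASTE(a, b) a %:%: b

/* Braces and brackets are spelled with digraphs; PASTE(val, ues)
   expands to the identifier values. */
int sum_values(void)
<%
    int PASTE(val, ues)<:LEN:> = <% 1, 2, 3 %>;
    int total = 0;
    for (int i = 0; i < LEN; i++)
        total += values<:i:>;
    return total;
%>

/* Inside a string literal the same two characters are left alone,
   because they are part of a larger token. */
const char *bracket_text(void)
<%
    return "<:";
%>
```

Here sum_values() adds up the three array elements exactly as the conventionally punctuated version would, while bracket_text() still returns the two characters < and : rather than a left bracket.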



c++

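C++ (through C++14, see below) behaves like C, including the C99 additions, but with additional tokens: %:%: for ##, compl for ~, not for !, bitand for &, bitor for |, and for &&, or for ||, xor for ^, and_eq for &=, or_eq for |=, xor_eq for ^= and not_eq for !=.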


Note that %:%: is treated as a single token, rather than as two occurrences of %:.


The C++ standard makes this comment with regard to the term "digraph":



The term "digraph" (token consisting of two characters) is not perfectly descriptive, since one of the alternative preprocessing-tokens is %:%: and of course several primary tokens contain two characters. Nonetheless, those alternative tokens that aren't lexical keywords are colloquially known as "digraphs".




removal of trigraphs

Trigraphs were proposed for deprecation in C++0x, which was released as C++11. This was opposed by IBM, speaking on behalf of itself and other users of C++, and as a result trigraphs were retained in C++0x. Trigraphs were then proposed again for removal (not deprecation) in C++17. This passed a committee vote, and trigraphs were removed from C++17 despite the opposition from IBM. Existing code that uses trigraphs can be supported by translating the source files (parsing trigraphs) into the basic source character set, which does not include trigraphs.


rpl

Hewlett-Packard calculators supporting the RPL language and input method provide support for a large number of trigraphs (also called TIO codes) to reliably transcribe non-seven-bit ASCII characters of the calculators' extended character set on foreign platforms, and to ease keyboard input without using the CHARS application.








