pp2cc
=====
pp2cc is a C compiler for the PP2 Practicum Processor. This processor (board)
was created by dr.ir. Rob Hoogerwoord from the University of Technology in
Eindhoven (TU/e). The resulting assembly can be assembled into machine code
using Assembler.jar.

This software was written for the Design Based Learning project "Embedded
Systems" in 2011 Q2. The assignment required assembly which could be written by
hand. Xander Houtman thought of dropping that idea and using a C compiler
instead. Since his string-parsing approach did not work very well, I (Peter Wu)
looked for a parser library and found pycparser.

Dependencies (tested versions in parentheses):

- Python (3.3.3, 2.7.6 recommended) - http://python.org/download/
- pycparser (v2.10, v2.05 recommended) - https://github.com/eliben/pycparser
- PLY (Python Lex-Yacc; 3.4) - http://www.dabeaz.com/ply/ (bundled with recent
  pycparser)

Recommended:
- cpp - a C preprocessor

Python is often installed by default on Linux distributions. On Windows, you
probably need to run the installer from the link above. Extract the pycparser
ZIP file (see link above) and copy the pycparser subdirectory to the directory
containing pp2cc.py. pp2cc is a console program; open a terminal (or cmd on
Windows) to see compiler messages.

pp2cc was originally written for Python 2, pycparser v2.05 and an external
PLY. The same code performs worse with newer versions of the dependencies
(measured on 21 C files, 618 lines in total, 3 trials):

- python 3.3.3, pycparser v2.09.1 (and v2.10), bundled PLY: 26 seconds
- python 2.7.6, pycparser v2.09.1 (and v2.10), bundled PLY: 20 seconds
- python 3.3.3, pycparser v2.05, PLY 3.4 (external): less than a second
- python 2.7.6, pycparser v2.05, PLY 3.4 (external): less than a second

Since the bundled PLY version equals the external version (3.4), pycparser
itself seems to have some issues with YACC (at least, that is what shows up in
profiling).

The cpp program is installed on most Linux distributions. If not, install a
conforming cpp, for example the one shipped with GCC. On Windows, you might
want to download mcpp (binary package) from
http://mcpp.sourceforge.net/download.html. Put bin\mcpp.exe in your %PATH% (or
the directory containing pp2cc.py) and rename it to "cpp.exe". If you do not
install a C preprocessor, you need to pass the --no-cpp option to skip
preprocessing. You will not be able to use comments or macros in this case.
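
The fallback described above could be automated by a small wrapper script. The
following is only a sketch and not part of pp2cc: the helper name
pp2cc_command is hypothetical, it assumes pp2cc.py is started through the
"python" command, and it only searches the PATH (shutil.which needs Python 3.3
or newer).

```python
import shutil


def pp2cc_command(sources, find_program=shutil.which):
    """Build a pp2cc.py command line, adding --no-cpp when no cpp is found.

    find_program is injectable so the PATH lookup can be replaced in tests.
    """
    command = ["python", "pp2cc.py"]
    if find_program("cpp") is None:
        # Without a preprocessor, comments and macros cannot be used.
        command.append("--no-cpp")
    return command + list(sources)
```

The returned list can be handed to subprocess.call to run the compiler.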

Usage: pp2cc.py [options] filename...
Multiple input files can be specified. Options can be given before and after
filenames.
Options:
    -o filename     The name of the output file. It defaults to the first
                    input file from which the extension is removed, and .asm is
                    added.
    --tree          Instead of compiling the file into assembly, show the parse
                    tree.
    -D name
    -D name=definition This option is passed to the cpp program, and acts like
                    adding #define name or #define name definition respectively
    -U name         This option is passed to the cpp program and acts like
                    adding #undef name
    --no-cpp        Disable the use of the C Preprocessor
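
The default for -o can be illustrated with a short Python sketch. The function
name default_output_name is hypothetical, and the use of os.path.splitext
(which strips only the last extension) is an assumption about the exact rule,
not a copy of the pp2cc source.

```python
import os.path


def default_output_name(input_files):
    """Return the assumed -o default: first input with its extension
    replaced by .asm."""
    first = input_files[0]
    base, _extension = os.path.splitext(first)
    return base + ".asm"
```

Under these assumptions, "pp2cc.py main.c util.c" would write main.asm.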

Conformance with the K&R interpretation of C and feature support
================================================================
A2.2 Comments - unsupported by the parser, but the preprocessor removes them
A2.5 Constants - integers (hexadecimal 0x1a, 0X1A, octal 07, decimal 1), signed
            only. Treating integers as unsigned is undefined. Unsupported:
            floats, char, string literals.
A4 Identifiers - functions and variables (int) are supported. struct and union
            are not yet supported
A4.1 Storage class - static is supported (actually, everything is static).
            Automatic variables are supported.
A4.2-A4.3 Types - supported: int; unsupported: char, enum, float, double,
            struct, union, pointers, arrays
A4.4 Type qualifiers - volatile won't be supported; const is unsupported
A5 Objects and lvalues - not explicitly used, but assignment works
A6 Conversions of operands - not supported since everything is treated as
            signed int
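
The integer constant forms listed under A2.5 map to values as in the following
Python sketch. It only illustrates the notation; parse_int_constant is a
hypothetical helper, not the routine pp2cc uses, and integer suffixes such as
u or l are deliberately not handled.

```python
def parse_int_constant(text):
    """Convert a C integer constant (hex, octal or decimal) to an int."""
    lowered = text.lower()
    if lowered.startswith("0x"):
        # Hexadecimal: 0x1a and 0X1A denote the same value.
        return int(lowered[2:], 16)
    if len(lowered) > 1 and lowered.startswith("0"):
        # A leading zero marks an octal constant.
        return int(lowered[1:], 8)
    return int(lowered, 10)
```

For example, 0x1a and 0X1A both yield 26, 07 yields 7 and 017 yields 15.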

Expressions support
-------------------
precedence and associativity - supported by underlying library for parsing
A7.1 Pointer generation - not tested/supported
A7.2 Primary Expressions - identifiers, constants (int) are stored in a
            register. Strings are unsupported
A7.3 Postfix Expressions - unsupported
function calls - supported, result is stored in register R0 (for int functions).
            Other types (void) are not checked when using their value. If an
            undefined function is used, it'll still try to branch to the label.
            Parameters are supported. Pointers to functions are unsupported.
            Recursive function calls are supported
A7.4 Unary Operators - supported: ++ -- + - ~ ! & * Unsupported: sizeof
A7.5 Casts - unsupported and ignored
A7.6-A7.7, A7.9-A7.13 - supported: * / % + - < > <= >= == != & ^ | Left to right
A7.8 Shift - << >> are supported. If the second operand is negative, no shift
            is performed. The shift treats the first operand as an unsigned
            integer. Example in bits assuming 4-bit words: 1000 (-8) >> 2
            becomes 0010 (2), 0110 (6) << 1 becomes 1100 (-4)
A7.14-A7.15 Logical AND && and OR || are supported. Result is indeed 0 or 1
A7.16 Conditional operator ? : - supported
A7.17 Assignment - supported: = *= /= %= += -= &= ^= |= <<= >>=
            Supported for variable names only, pointers and array references
A7.18 Comma - supported
A7.19 Constant expressions - not checked
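
The shift behaviour described under A7.8 can be modelled in Python as below.
This is a sketch of the described semantics, not the generated assembly. The
function names and the bits parameter are hypothetical; the word size is passed
in explicitly because this README does not state it, and two's complement is
assumed because the 4-bit example uses it. Shift counts of bits or more simply
follow from the masking here, which is also an assumption.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        value -= 1 << bits
    return value


def shift_right(value, count, bits):
    """Logical right shift: the first operand is treated as unsigned."""
    if count < 0:
        # A negative second operand means no shift is performed.
        return to_signed(value, bits)
    unsigned = value & ((1 << bits) - 1)
    return to_signed(unsigned >> count, bits)


def shift_left(value, count, bits):
    """Left shift; bits shifted out of the word are discarded."""
    if count < 0:
        return to_signed(value, bits)
    unsigned = value & ((1 << bits) - 1)
    return to_signed(unsigned << count, bits)
```

With bits=4 this reproduces the example: shift_right(-8, 2, 4) gives 2 and
shift_left(6, 1, 4) gives -4.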

A8 Declarations
A8.1 Storage class specifiers - static function and global variables are
            supported, static local variables aren't. auto is implied in
            functions
A8.2 Type specifiers - unsupported, everything is assumed to be int. void in
            the meaning of "no value" is not checked. Const is not meaningfully
            supported yet
A8.3 Structure and union declarations - unsupported
A8.4 Enumerations - unsupported
A8.5 Declarators - pointers and qualifiers are ignored, only a direct name is
            supported. Unsupported: array
A8.6 Meaning of declarators - pointer is supported, array is not. function is
            supported without parameters and assumed to be an int function.
A8.7 Initialization - Supported for an expression resulting in an int. Arrays
            are unsupported

A8.8 Type names - not verified
A8.9 Typedef - won't be supported as we have int only
A8.10 Type equivalence - not supported as everything is an int

A9 Statements
A9.1 Labeled statements - supported
A9.2 Expression statement - supported by parser
A9.3 Compound statement - supported
A9.4 Selection statements - if and if/else are supported. switch is also
            supported, including fallthrough and case/default labels
A9.5 Iteration statements - while, do/while and for are supported. A missing
            second expression in the for is equivalent to a non-zero constant
A9.6 Jump statements - goto, continue and break are supported. return is
            supported with and without value. The function result is
            undefined in the latter case

A10 External Declarations
A10.1 Function Definitions - unsupported: extern static, parameters. Old style
            parameter list is unsupported
A10.2 External declarations - unsupported/unchecked

A11 Scope and Linkage
A11.1 Lexical scope - objects (variables) and functions use the same namespace
            conforming to the Standard
A11.2 Linkage - not applicable / unsupported

A12 Preprocessing
Not supported by the parser; use a dedicated preprocessor like cpp and enable
removal of comments. Example: cpp -P file.c
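
A script that runs the preprocessor separately could assemble the cpp command
line as sketched below, combining the -P example with the -D and -U options
from the usage section. The helper cpp_command and its parameters are
hypothetical; how pp2cc itself invokes cpp is not shown in this README.

```python
def cpp_command(source, defines=None, undefines=None):
    """Build an argument list for 'cpp -P' with optional -D and -U options.

    defines is a list of (name, definition) pairs; a definition of None
    acts like a plain '#define name'.
    """
    command = ["cpp", "-P"]
    for name, definition in (defines or []):
        if definition is None:
            command += ["-D", name]
        else:
            command += ["-D", "{0}={1}".format(name, definition)]
    for name in (undefines or []):
        command += ["-U", name]
    command.append(source)
    return command
```

The resulting list can be passed to subprocess.check_output to obtain the
preprocessed source text.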

A13 Grammar
Not checked