1.37M
Category: programmingprogramming

Эпизод III: Свой язык программирования

1.

Эпизод III:
Свой язык
программирования

2.

Литература
Dragon book

3.

Компилятор – это транслятор
High level
language
compiler
Low level
language
A compiler translates high-level programs to low-level program

4.

Компилятор – это транслятор
C
gcc
x86
GCC translates C programs to object code for X86 (and other architectures

5.

Компилятор – это транслятор
Java
javac
JVM
bytecode
A Java compiler translates Java programs to bytecode instructions for Java Virtual Machine

6.

Архитектура: многопроходный компилятор
Java
Parse
Type check
Optimize
Code Gen
JVM
bytecode
A Java compiler translates Java programs to bytecode instructions for Java Virtual Machine

7.

Промежуточные представления
Java
code
Parse
Abstract
syntax
tree
Type
check
Annotated
AST
Optimize
Transformed
AST
Code
Gen
JVM
bytecode
A compiler is a composition of a series of translations between intermediate languages

8.

Компоненты компилятора
Java
code
Parse
Abstract
syntax
tree
Type
check
Annotated
AST
Optimize
Transformed
AST
Parser (Lexical analyzer + Syntax analyzer)
• Reads in program text
• Checks that it complies with the syntactic rules of the language
• Produces an abstract syntax tree
• Represents the underlying (syntactic) structure of the program
Code
Gen
JVM
bytecode

9.

Компоненты компилятора
Java
code
Parse
Abstract
syntax
tree
Type
check
Annotated
AST
Optimize
Transformed
AST
Code
Gen
JVM
bytecode
Type checker (Semantic analyzer)
• Consumes an abstract syntax tree
• Checks that the program complies with the static semantic rules of the language
• Performs name analysis, relating uses of names to declarations of names
• Checks that the types of arguments of operations are consistent with their
specification

10.

Компоненты компилятора
Java
code
Parse
Abstract
syntax
tree
Type
check
Annotated
AST
Optimize
Transformed
AST
Code
Gen
Optimizer
• Consumes a (typed) abstract syntax tree
• Applies transformations that improve the program in various dimensions
• execution time
• memory consumption
• energy consumption
JVM
bytecode

11.

Компоненты компилятора
Java
code
Parse
Abstract
syntax
tree
Type
check
Annotated
AST
Optimize
Transformed
AST
Code
Gen
Code generator
• Transforms abstract syntax tree to instructions for a particular computer
architecture
• aka instruction selection
JVM
bytecode

12.

Компилятор = front-end + back-end
Back-end
Front-end
Java
code
Parse
Type
check
Annotated
AST
Optimize
Code
Gen
JVM
bytecode
A compiler can typically be divided in a front-end (analysis) and a back-end (synthesis)

13.

Компилятор = front-end + back-end
Back-end
Front-end
C
Parse
Type
check
LLVM IR
Optimize
Code
Gen
x86
A compiler can typically be divided in a front-end (analysis) and a back-end (synthesis)

14.

Компилятор = front-end + back-end
Front-end
C
Parse
Back-end
Type
check
Optimize
Code
Gen
x86
Code
Gen
Arm
LLVM IR
Front-end
C++
Parse
Type
check
Back-end
Optimize
Repurposing: reuse a back-end for a different source language
Retargeting: compile to different hardware architecture

15.

Типы компиляторов (1)
Compiler
• translates high-level programs to machine code for a computer
Bytecode compiler
• generates code for a virtual machine
Just-in-time compiler
• defers (some aspects of) compilation to run time
Source-to-source compiler (transpiler)
• translate between high-level languages
Cross-compiler
• runs on different architecture than target architecture

16.

Типы компиляторов (2)
Interpreter
• directly executes a program (although prior to execution program is typically
transformed)
Hardware compiler
• generate configuration for FPGA or integrated circuit
De-compiler
• translates from low-level language to high-level language

17.

Lexing & Parsing

18.

Лексический и синтаксический анализ программы
Text
Lexer
Tokens
Parsing
AST

19.

Пример контекстно-свободной грамматики

20.

Лексический и синтаксический анализ программы

21.

Пример для арифметических выражений

22.

Вывод

23.

Дерево разбора

24.

Пример для арифметических выражений
id * id + id
+
*
id
id + id + id
id
id

25.

Semantic analysis

26.

А нам точно нужен семантический анализ?

27.

Name analysis

28.

Validity

29.

Visibility

30.

Qualification

31.

Multiple namespaces

32.

Symbol table

33.

Граф потока управления
entry
void foo() {
int x = source();
if (x < MAX) {
int x = call (“source”)
int y = 2 * x;
sink(y);
}
x < MAX
true
}
int y = 2 * x
false
call (“sink”, y)
exit

34.

Code property graph
entry
void foo() {
int x = source();
if (x < MAX) {
int x = call (“source”)
int y = 2 * x;
sink(y);
}
x < MAX
true
}
int y = 2 * x
false
call (“sink”, y)
exit

35.

Семантический анализ
source
def
entry
void foo() {
int x = source();
if (x < MAX) {
int x = call (“source”)
int y = 2 * x;
sink(y);
}
x < MAX
true
}
int y = 2 * x
false
call (“sink”, y)
exit
sink
def
English     Русский Rules