"Reverse engineered" is a bit of a stretch. You can compile CUDA with Clang/LLVM. LLVM also supports emitting SPIR, OpenCL's intermediate language. While it may not be trivial to emit SPIR from a CUDA frontend, it probably does not involve much "reverse" engineering.
And then there is this quote.
> While there is an independent GPGPU standard dubbed OpenCL, it isn’t necessarily as good as CUDA, Otoy believes.
CUDA colloquially refers to both the language and the toolkit NVIDIA supports, and this quote does not say which part he means. The reason one might consider CUDA "good" is not the language (it is fairly similar to OpenCL) but the toolkit. Implementing a cross-compiler does not make the CUDA libraries (such as cuBLAS, cuFFT, cuDNN) portable. They are still closed source and cannot be supported by this compiler.
Then there are issues with performance portability: just because code runs on all the GPUs does not mean it performs well on all of them. We constantly see this problem with OpenCL as well.

This article reads like a PR post with little to no understanding of the GPU compute ecosystem.
The reason LLVM is so popular is not its license but its modular architecture (see all the projects using LLVM). GCC is a monolithic compiler which is impossible to extend. If GCC is to remain competitive, RMS should consider a rewrite.
"Undefined behavior" doesn't mean that it doesn't work, it means that the compiler can do whatever it wants, which sometimes makes it work accidentally and sometimes makes it do something completely different. This article and the two following it are a good introduction to what it is, why it's dangerous, and why it seems like it works sometimes.
C++ also has this restriction, but in a weaker form. It is undefined behavior to mutate a value that was declared as `const` (equivalent to Rust declarations without `mut`), but fine to take a mutable value, make a non-mutable pointer to it, and then cast it back to mutable.
In Rust, constant vs. mutable references are passed to the code generator with information saying that they cannot point to the same thing (the "noalias" model). So for a function like `fn do_thing(x: &i32, y: &mut i32) { *y += *x; *y += *x; }`, the Rust language says that a compiler is allowed to speed things up by loading `x` only once instead of loading it twice just in case `y` points to the same value.
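For concreteness, here's a runnable version of that example (the `main` wrapper is mine); the aliasing guarantee is what lets the compiler cache `*x` in a register across the two statements:

```rust
// Because `x: &i32` and `y: &mut i32` can never alias in Rust,
// the compiler may load `*x` once and reuse it for both additions.
fn do_thing(x: &i32, y: &mut i32) {
    *y += *x;
    *y += *x;
}

fn main() {
    let x = 3;
    let mut y = 10;
    do_thing(&x, &mut y);
    println!("{}", y); // 10 + 3 + 3 = 16
}
```

The equivalent C function would need `restrict` on both pointers to promise the same thing.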
I think this is one of the big problems Doug Gregor mentioned in his talk on the C/C++ modules proposal, when he was talking about the fundamental brokenness of headers.
> I don't know how they developed this so fast
The LLVM paper is from '04, there was a technical report from '03, and Lattner's Master's thesis (also with LLVM in the title) is from '02.
It's been worked on for 10 years. Not to say that they didn't accomplish a lot in a short period of time, but they also didn't pop up over night.
> but I suspect some big whigs thought it'd be important to pool money into a compiler that wasn't drugged on a viral license like the GCC.
Apple hired Lattner after he finished his PhD. I don't know the details, but I'd guess that UIUC owns a good chunk of the IP. As far as I know, it's always been licensed under the UIUC BSD-ish license.
It was just a really, really good solution for a really important research problem. Turned out to be good enough to get Lattner a job with a fancy title and to "shake up" the compiler world a bit.
Ha! I have a bachelor's in software engineering and worked as a compiler developer for 3 years. So, don't worry about it :) Nobody will look at your degree and wonder if you had any compiler courses; they will just be interested in what you know about compilers. Lucky for you, compiler devs are in very high demand, but you do need to know your stuff.

So, aside from that: don't write a parser, nobody cares :) Writing your own compiler would be very cool, or you could hack on LLVM/Clang. A large amount of the work revolves around using LLVM, and it lets you work on only the interesting bits.
As for low level stuff, it might be a bit harder but it's a wide topic. Are you talking low-level optimization? Systems programming? kernel-mode?
EDIT: by demand, a short list of positions
Every single semiconductor company needs a compiler, so the big companies to apply to are AMD/Nvidia/Qualcomm/Broadcom/Imagination Technologies/Apple/Microsoft/Intel. Smaller companies can be found by going through the LLVM dev meetings, which list the company each speaker works for. That should be a nice list. Stalking the LLVM mailing list might be a good way to find people as well.
There's still a lot of polish to be done here, but it's ready to be tried out.
I met Kostya last week at wontfix_cabal and we talked about fuzzing in Rust. He works on Google's sanitizer/fuzzer team, and is currently working on libFuzzer, a library-based general fuzzer. Unlike AFL, which calls a binary with random input (which is then tweaked to try and get the program to crash), libFuzzer instead links to your code and calls a specific function with random input. A very nice thing about libFuzzer is that you don't need to recompile LLVM/rustc to make it work, it uses builtin sanitizer instrumentation support (and we already support sanitizers in Rust).
Take it for a spin, and please file bugs!
(Help fixing said bugs would be great, too, I was mostly focused on getting an MVP out)
One thing that is stable is `#[cold]`. Any branch that would lead to a cold function is considered unlikely, so you could consider splitting the unusual path into a separate function.
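A minimal sketch of that pattern (the function names here are made up for illustration):

```rust
// Marking the rare path #[cold]: any branch leading to it is treated as
// unlikely, and #[inline(never)] keeps it out of the hot function's body.
#[cold]
#[inline(never)]
fn handle_overflow(x: u32) -> u32 {
    u32::MAX - x // stand-in for some expensive fallback
}

fn bump(x: u32) -> u32 {
    match x.checked_add(1) {
        Some(v) => v,               // hot path
        None => handle_overflow(x), // branch considered unlikely
    }
}
```

The hot path stays small and branch prediction hints fall out of the attribute, without unstable `likely`/`unlikely` intrinsics.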
If I recall correctly there is a misoptimization in LLVM that led to temporarily disabling sending all this aliasing information to LLVM. In addition, I don't think annotating pointers as non-aliasing is currently possible except for function arguments (see here).
Learning to read LLVM IR is a must - even after we start doing our own optimizations, we'll still be using LLVM primarily for at least one or two years.
The most important thing is probably to figure out what you can remove: the LLVM IR has lots of extra information, but is a RISC at heart, with pseudo-infinite registers.
Once you learn to scan through it for information you need, and maybe do diffs to see the effect of a code change in the IR, it can be easier than x86 assembly.
The LLVM "language reference" docs are good enough for understanding almost all of it, especially the more common bits.
There's a whole class of bugs like that - see description here: http://blog.regehr.org/archives/1161
Basically you find compiler bugs by taking code like

```cpp
#include <iostream>

bool func(int n) {
    if (n > 45) { return true; }
    return false;
}

int main() { std::cout << func(46); }
```

and changing the code so that `func` just returns `true`.
In this case the output program should be identical, and when it's not it's a compiler bug (usually an optimisation bug).
http://llvm.org/bugs/show_bug.cgi?id=18447 for example
TL;DR compilers are amazing, compiler bugs are rare and crazy
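The same equivalence idea, translated into Rust (my own sketch, not the actual fuzzer): the original predicate and the simplification that is valid for the tested input must agree, and a compiled binary where they don't reveals an optimizer bug.

```rust
// The predicate from the C++ example.
fn func(n: i32) -> bool {
    n > 45
}

// A replacement that is only valid for inputs known to satisfy n > 45,
// such as the constant 46 used in the test program.
fn func_simplified(_n: i32) -> bool {
    true
}

fn main() {
    // Both programs must print the same output; a difference after
    // compilation is a compiler (usually optimisation) bug.
    println!("{} {}", func(46), func_simplified(46));
}
```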
The benchmark code isn't very realistic.
The Itanium ABI defines a zero-cost exception model. Of course, it's not actually zero cost, because the increased generated code size means less space in the cache for actual code. In addition, there's a substantial branching cost when an exception is thrown, though that should of course be rare. Dealing with return codes isn't free either, but it is easier to reason about in terms of cost.
The LLVM tools, for instance, do not use exceptions because of the overhead in real-world compilers (http://llvm.org/docs/CodingStandards.html#do-not-use-rtti-or-exceptions)
That said, for the vast majority of applications, it's a non-issue.
> Try swapping out msvc for clang while still keeping the rest of visual studio intact.
My plan for next year is Scala -> LLVM -> GLSL
Priorities:
Clang/LLVM emits compatible C++ EH handlers; this was done as part of r233767 (http://llvm.org/viewvc/llvm-project?view=revision&revision=233767), but more work is ongoing.
Compatible SEH handlers for x64 have been implemented (see r235154: http://llvm.org/viewvc/llvm-project?view=revision&revision=235154), but work is still ongoing there too. 32-bit x86 is also in progress.
Of interest is the LLVM `StringSwitch` class, which lets you do this:

```cpp
Color color = StringSwitch<Color>(argv[i])
    .Case("red", Red)
    .Case("orange", Orange)
    .Case("yellow", Yellow)
    .Case("green", Green)
    .Case("blue", Blue)
    .Case("indigo", Indigo)
    .Cases("violet", "purple", Violet)
    .Default(UnknownColor);
```
-- http://llvm.org/docs/doxygen/html/classllvm_1_1StringSwitch.html
Unlike the proposed method, this class needs none of those allocations of std::function, std::vector, or std::unordered_map.
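For comparison (my own sketch, not from the thread), Rust's `match` on string slices gives the same allocation-free dispatch:

```rust
#[derive(Debug, PartialEq)]
enum Color { Red, Orange, Yellow, Green, Blue, Indigo, Violet, Unknown }

// No std::function / Vec / HashMap involved: the compiler lowers this
// to a chain of string comparisons (or better) with zero allocation.
fn parse_color(s: &str) -> Color {
    match s {
        "red" => Color::Red,
        "orange" => Color::Orange,
        "yellow" => Color::Yellow,
        "green" => Color::Green,
        "blue" => Color::Blue,
        "indigo" => Color::Indigo,
        "violet" | "purple" => Color::Violet, // like .Cases("violet", "purple", Violet)
        _ => Color::Unknown,                  // like .Default(UnknownColor)
    }
}
```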
Ugh, I hate these. I spent two weeks chasing a backend bug in my Lisp DSL interpreter, and finally found that if I didn't strip one of my libraries, it would not occur. Then I found that if I compiled at -O2 instead of -O3, it would not occur. The only difference between the two in Clang at the moment is the -argpromotion pass. I still haven't figured out where the bug actually is, but I'm just compiling at -O2 for now.
I've filed a bug to retest when we move to Clang 5 in a few months.
rustc currently uses the default set of LLVM passes, which are generally designed for C/C++. Rust is fortunately pretty close to them in terms of core functionality, with the exception of things like bounds checks: they aren't particularly common in those languages and hence the passes haven't been chosen to ensure they're removed. However, the LLVM authors recognise this and give some hints for improving:
> One of the most common mistakes made by new language frontend projects is to use the existing -O2 or -O3 pass pipelines as is. These pass pipelines make a good starting point for an optimizing compiler for any language, but they have been carefully tuned for C and C++, not your target language. You will almost certainly need to use a custom pass order to achieve optimal performance. A couple specific suggestions:
> 1. For languages with numerous rarely executed guard conditions (e.g. null checks, type checks, range checks) consider adding an extra execution or two of LoopUnswitch and LICM to your pass order. The standard pass order, which is tuned for C and C++ applications, may not be sufficient to remove all dischargeable checks from loops.
> 2. If your language uses range checks, consider using the IRCE pass. It is not currently part of the standard pass order.
(It of course depends on what the crates implementing the indexing are doing internally.)
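As a concrete illustration of the bounds-check point (my own example, not from the comment): an indexed loop carries a check per access that LLVM must prove away, while the iterator form never emits one in the first place.

```rust
// Indexed form: each `v[i]` has a bounds check that the optimizer has to
// eliminate (here it usually can, since i < v.len() is visible).
fn sum_indexed(v: &[u32]) -> u32 {
    let mut s = 0;
    for i in 0..v.len() {
        s += v[i];
    }
    s
}

// Iterator form: no bounds checks are emitted at all.
fn sum_iter(v: &[u32]) -> u32 {
    v.iter().sum()
}
```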
According to http://llvm.org/svn/llvm-project/libcxx/trunk/include/array, it looks like most of the code is boilerplate for things like comparison operators, tuple support, and every possible combination of const/non-const `begin()` and `end()` iterators/reverse iterators.
> > LLVM was supposed to be a low-level, relatively-language-agnostic compiler IR

> What? The very type system is specialized for C.
I quote:

> LLVM defines a common, low-level code representation in Static Single Assignment (SSA) form, with several novel features: a simple, language-independent type-system that exposes the primitives commonly used to implement high-level language features; an instruction for typed address arithmetic; and a simple mechanism that can be used to implement the exception handling features of high-level languages (and setjmp/longjmp in C) uniformly and efficiently.
Later:

> First, LLVM has no notion of high-level constructs such as classes, inheritance, or exception-handling semantics, even when compiling source languages with these features. Second, LLVM does not specify a runtime system or particular object model [...]
Compare with the blog post:

> The landingpad instruction specifies the personality function the Exception Handling runtime uses, a list of types which it can catch (int), and a list of types which foo is allowed to throw (const char *).
LLVM is an intermediate representation for compiled code. Like Java or .NET CLR, except it's designed to be more comparable to actual hardware (hence the LL, for low-level) but at the same time not being tied to any one machine (hence the VM, for virtual machine).
It was originally designed as a new backend for GCC, and can be used in that fashion, but is now typically used with newly written LLVM frontends such as clang. LLVM+clang together form a complete C compiler, independent of GCC (both in code and in copyright) and much more liberally licensed.
Of course, it also works with any other language front-end that writes LLVM output. Apple even uses it to compile OpenGL shaders.
You want projects using LLVM? Here ya go!
> I don't think of that amount of gain as typical for a language like C. If it's typical for Rust, that's interesting.
Rust knows way more about your code than C does, so we can be much more aggressive with our optimizations. One great example is that every `&mut` can basically be the equivalent of `restrict` on function arguments in C, since we know there's no aliasing going on. See `noalias` in http://llvm.org/docs/LangRef.html
> If that gain is coming mostly from "stuff that LLVM is doing", rather than "stuff that rustc is doing before LLVM gets its turn", that's even more interesting.
It currently is. We've mostly been focused on language semantics, and feed LLVM sub-optimal IR all the time.
and of course, actually using checked arithmetic is probably the only way we'll finally get hardware support for checked operations, instead of having to call `seto` like savages.

FWIW Rust's `checked_*` methods use LLVM's overflow intrinsics, which should be as efficient as can be; e.g. libo claims `overflow_mul(int *, int, int)` compiles to

```asm
imull %edx, %esi
movl %esi, (%rdi)
seto %al
ret
```

which should be the minimal implementation.
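On the Rust side, the checked methods look like this (my example):

```rust
// checked_mul returns None on overflow instead of wrapping; LLVM lowers it
// to the multiply-plus-overflow-flag pattern shown in the assembly above.
fn mul_exact(a: i32, b: i32) -> Option<i32> {
    a.checked_mul(b)
}
```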
LLVM doesn't have a GC. It's basically a code generator (that can work as a JIT or a standard ahead-of-time compiler), and not much else. It's assumed that the GC is part of the language runtime, which LLVM doesn't care much about.
The LLVM intermediate representation does support some GC-related operations though. So, you can make LLVM generate code that ties into the runtime's VM.
http://llvm.org/docs/GarbageCollection.html
VMKit (which is built on top of LLVM, and provides services you can use to build typical VMs on, including GC) uses Jikes' GC. They have a JVM implemented on top of that.
LLVM has had a basic partial inlining implementation for a while (I wrote it). It didn't prove to be a particular performance win, though maybe finer tuning would help.
> What about Python?
Let me Google that for you https://docs.python.org/3/license.html
> All Python licenses, unlike the GPL, let you distribute a modified version without making your changes open source.
> LLVM?
Let me Google that for you, as well http://llvm.org/docs/FAQ.html#license
> Can I modify LLVM source code and redistribute the modified source?
> Yes. The modified source distribution must retain the copyright notice and follow the three bulleted conditions listed in the LLVM license.
> Can I modify the LLVM source code and redistribute binaries or other tools based on it, without redistributing the source?
> Yes. This is why we distribute LLVM under a less restrictive license than GPL, as explained in the first question above.
Oracle is given the rights, according to these licenses, to do what they do. Just because someone has a more permissive license, doesn't mean you can assume everyone has a permissive license, or must offer one. Oracle is entirely consistent.
LLVM is the engine of many programming language compilers. Clang, the best-known LLVM frontend, is a C/C++/ObjC compiler (Swift is built on LLVM too). The Clang/LLVM toolchain is also used on the PS4. Clang recently got a Visual C++-compatible mode, so you can use it for Windows platform development as well and link programs directly against the Visual C++ Redistributable Packages.

I think not many programs are compiled to Windows with LLVM/Clang at this moment, but that can change in the future, as Clang is developed by Apple/Google/Intel... so the big ones.
A jump table is generated in both Rust and C++ if it meets certain criteria.
https://gist.github.com/mehcode/e8f4e7c5b7eb7a3bf5a004a9983172c4
That example does generate a "switch" which is key because LLVM has switch lowering to jump table, under the right conditions.
http://llvm.org/devmtg/2015-10/slides/Wennborg-SwitchLowering.pdf
If I change my example to use enums, the code generation is no different, which answers OP's question.
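A sketch (mine, not the linked gist) of the kind of match that qualifies:

```rust
// A dense, exhaustive match like this produces an LLVM `switch`
// instruction, which the switch-lowering pass may turn into a jump
// table when the cases are dense enough (heuristics apply).
enum Op { Add, Sub, Mul, Div }

fn apply(op: Op, a: i32, b: i32) -> i32 {
    match op {
        Op::Add => a + b,
        Op::Sub => a - b,
        Op::Mul => a * b,
        Op::Div => a / b,
    }
}
```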
> breath of fresh air in the compilers
I've used it for almost 13 years now (I put a compiler based on LLVM 1.1 in production in 2003!)
Feels more like an old buddy.
I love how they stick to their style and principles. For instance, the 2004 version of the home page is pretty much the same as they have now. It just happens to expand with new libraries and tools from time to time: https://web.archive.org/web/20040503004454/http://llvm.org/
> Am I seeing results that are typical?
Absolutely.
> If so, can anyone share some insight into why -O is so different?
Well, I mean, turning on optimizations is just going to make things faster, by their nature. There are all sorts of things that can be done. You can read up here: http://llvm.org/docs/Passes.html (or maybe someone else more familiar with LLVM can give you a better link)
I guess the question boils down to whether LLVM has a fence primitive that doesn't get emitted. I see in the LLVM language reference that the `fence` instruction has a `singlethread` variant which looks like it does exactly what you want.

As for making LLVM actually emit that IR, it looks to me like rustc isn't currently capable of doing so. `librustc_llvm` contains a function `LLVMBuildAtomicFence(LLVMBuilderRef B, AtomicOrdering order)`, which appears to handle all of the fence intrinsics. Inside LLVM, an optional scope can be specified to make the fence singlethreaded. Since rustc doesn't currently expose this, it looks like it will only emit multithreaded fences.
Adding this support looks like a pretty easy patch, fortunately.
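For what it's worth, current Rust does expose both flavours through `std::sync::atomic` (this postdates the comment):

```rust
use std::sync::atomic::{compiler_fence, fence, Ordering};

// `fence` lowers to a multithreaded LLVM fence (a real barrier
// instruction on weakly-ordered targets); `compiler_fence` corresponds
// to the single-threaded variant: it only restricts compiler reordering
// and emits no instruction.
fn barriers() {
    fence(Ordering::SeqCst);
    compiler_fence(Ordering::SeqCst);
}
```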
From the Clang release notes:
> * Clang uses the new MingW ABI GCC 4.7 changed the mingw ABI. Clang 3.4 and older use the GCC 4.6 ABI. Clang 3.5 and newer use the GCC 4.7 abi.
(Capitalization has something to be desired there; MinGW is capitalized two ways, as is ABI — presumably this is what /u/happyscrappy was complaining about.)
LLVM's libc++ also looks to be using a growth factor of 2, so it's not just GNU libstdc++ "staunchly" using it.
http://rextester.com/LXCBR62665
Line 941: http://llvm.org/viewvc/llvm-project/libcxx/trunk/include/vector?revision=215332&view=markup
If you're willing to give up purely contiguous storage, but want chunky allocation and O(1) random access, then there are other interesting data structures out there:
"Optimal Resizable Arrays in Space and Time", for example, details a data structure that gives you amortized constant-time push_back, never invalidates iterators, has a cache-friendly layout, and allocates smoothly, wasting at most O(√n) memory during reservation. In this particular data structure, using a block size factor of 2 means superblock and block mapping can be performed using bitmasks and shifts.
We don't have any plans to produce standalone, living reference documentation. We have a few high-level things, but I expect they will grow stale very quickly: http://llvm.org/docs/PDB/index.html http://llvm.org/docs/SourceLevelDebugging.html#codeview-debug-info-format
What we do have that should help a lot is functional dumpers for the format. If anyone wants to develop new tools, they should be able to use llvm-readobj and llvm-pdbutil to validate their output and try to understand what records work in the debugger.
I nailed it down to the exact optimization pass that solves the recurrence: it is indvars. I was expecting some kind of pattern matching, but apparently there is none; I still don't understand how it works :).

For reference, the minimal sequence of passes that kills the loop:

```shell
clang++ -S -emit-llvm test.cpp -o /dev/stdout | opt -S -mem2reg -indvars
```
Unfortunately, this model is too weak to allow for some important optimizations. Consider a function like this:
```rust
fn f(a: &mut [u32], b: &[u32]) {
    for (p, q) in a.iter_mut().zip(b.iter()) {
        *p += *q;
    }
}
```
The iterator expression returns a pair of `&mut u32` and `&u32`, which means that the accesses in the main statement `*p += *q` can be assumed not to alias. This means that for every iteration of the loop, `a[i]` is distinct from `b[i]`. However, there's nothing that lets you determine that `a` and `b` are entirely disjoint, unless you add knowledge of slice types to your optimizer (which LLVM obviously doesn't have). Therefore you can't vectorize this loop independently of the callers of `f` without adding a dynamic aliasing check.
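The dynamic check mentioned there might look something like this (entirely my sketch; the function name is made up). A vectorizer would branch on it at runtime, taking the vector body when the slices are disjoint and the scalar loop otherwise:

```rust
// Returns true when two u32 slices occupy non-overlapping memory.
fn disjoint(a: &[u32], b: &[u32]) -> bool {
    let elem = std::mem::size_of::<u32>();
    let (a_lo, a_hi) = (a.as_ptr() as usize, a.as_ptr() as usize + a.len() * elem);
    let (b_lo, b_hi) = (b.as_ptr() as usize, b.as_ptr() as usize + b.len() * elem);
    // Half-open ranges [lo, hi) overlap unless one ends before the other begins.
    a_hi <= b_lo || b_hi <= a_lo
}
```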
There's a deeper problem in that LLVM's `alias.scope` feature is too strong to model Rust's weaker aliasing guarantees. If you use `alias.scope` annotations to indicate that two memory operations don't alias, it appears to ignore intraprocedural control dependencies, e.g. it applies across loop iterations. This means that you can't actually use `alias.scope` in this situation:
```rust
impl Context {
    fn pair(&mut self) -> (&mut u32, &u32) {
        ...
    }
}

fn f(c: &mut Context) {
    loop {
        if ... { break; }
        let (p, q) = c.pair();
        *p += *q;
    }
}
```
If you put `alias.scope` annotations on the memory operations in the statement `*p += *q`, then you are telling LLVM that there is never a memory dependency between the store to `*p` and the load of `*q`, but that isn't true for every potential implementation of `Context::pair`. It could be that the LLVM LangRef is missing some intended details here, but I think I'm reading it correctly.
I don't think that LLVM code is that elegant: http://llvm.org/docs/doxygen/html/ErrorOr_8h_source.html

It seems to have much more boilerplate than Rust: https://doc.rust-lang.org/src/core/result.rs.html#245-253
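For reference, the Rust side of that comparison in use (my example): `Result` is an ordinary enum and `?` does the propagation, with no wrapper class machinery.

```rust
use std::num::ParseIntError;

// Parse a string and double it; any parse error propagates via `?`.
fn parse_and_double(s: &str) -> Result<i32, ParseIntError> {
    let n: i32 = s.parse()?;
    Ok(n * 2)
}
```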
"GHC uses a portable garbage collector, implemented in the runtime system, that requires no explicit support from the backends" [such as LLVM].
Some additional research turned up this Rust reference page, which states that the following behaviour is undefined:

> * Breaking the pointer aliasing rules on accesses through raw pointers (a subset of the rules used by C)
> * &mut T and &T follow LLVM's scoped noalias model, except if the &T contains an UnsafeCell<U>. Unsafe code must not violate these aliasing guarantees.
I've tried reading the linked LLVM documentation, but I don't really understand it. Can anyone else make sense of it?
I wrote two of the things he linked to: the Swift Kaleidoscope implementation and LLVM.swift. Before starting those about a month ago, I had no idea how LLVM worked. I definitely recommend the Kaleidoscope tutorial to anyone remotely interested in learning a bit about their compiler.
> instead of using llvm's bitcode (which is architecture independent)?
That is not true. See the LLVM FAQ:
http://llvm.org/docs/FAQ.html#can-i-compile-c-or-c-code-to-platform-independent-llvm-bitcode
Well, there are fairly big projects that ban static objects with non-trivial constructors (or destructors). There are, thus, good reasons to prefer a singleton.
However, I suspect that you're actually mistaken about not needing an instance.
Comments are useful narration. See for example a file like this:
```c
// ==
if (test_getf2(0x1.234567890abcdef1234567890abcp+3L,
               0x1.234567890abcdef1234567890abcp+3L, GREATER_EQUAL_0))
    return 1;
// >
// exp
if (test_getf2(0x1.234567890abcdef1234567890abcp+3L,
               0x1.234567890abcdef1234567890abcp-3L, GREATER_EQUAL_0))
    return 1;
// mantissa
if (test_getf2(0x1.334567890abcdef1234567890abcp+3L,
               0x1.234567890abcdef1234567890abcp+3L, GREATER_EQUAL_0))
    return 1;
```
These minimalist comments are in the "what" tradition, yet they're still really helpful. No way you can look at that code and instantly locate, say, the mantissa comparison test, except for the comments! And I defy you to "clean that up" in a way that doesn't negatively impact the code, e.g. by adding more indirection you have to hunt through.
I think mrgoldenbrown means licensing the compiler chunks within another tool you've developed. GCC uses a more restrictive license.
> Can I modify the LLVM source code and redistribute binaries or other tools based on it, without redistributing the source?
> Yes. This is why we distribute LLVM under a less restrictive license than GPL, as explained in the first question above.
The shittiest one should be less shitty now as of this commit: http://llvm.org/viewvc/llvm-project/www-releases/trunk/3.4/tools/clang/docs/ReleaseNotes.html?r1=198098&r2=198630&pathrev=198630&diff_format=h
(maybe pending regeneration from Sphinx)
The main LLVM page says:
> Jan 6, 2014: LLVM 3.4 is now available for download! LLVM is publicly available under an open source License. Also, you might want to check out the new features in SVN that will appear in the next LLVM release.
But it seems like for now 3.4 isn't showing up in the downloads section, presumably it'll be very very soon.
The LLVM release notes are here.
An aside from the article that I thought was interesting:

> Rust's calling convention means there is a lot more setup. Of particular note was the panic unwind. LLVM does a tail call optimisation, so instead of a call, it does its cleanup and a jump. That means that when baz returns, it returns directly to foo's caller rather than foo. It's a neat optimisation and saves on stack space and instruction calls. However, that does mean that all the code after that is unused. This looks like a bug, albeit not a serious one, either in LLVM or in the rustc backend.
This is not a bug, AFAIK. This is how exception handling is implemented in LLVM (and C++). The exception handler code is stored after the return statement and is only jumped to by the exception handling routine. See this link for more information.
EDIT: I found the first link here. I recognized the assembly pattern from looking at a lot of generated C++ code and wanted to double check that Rust used the same exception handling structures before making a post here. There are other supplemental links in the source file linked in this edit that may be handy but I wanted to also share insight as to how I came to this conclusion.
As with all things, you must weigh the good with the bad. You may not like this single add-on, but you also have to consider all the compiler research that LLVM makes possible (and this list is certainly incomplete). Much of this simply isn't feasible with GCC since it's intentionally designed to be uncooperative to extension/add-ons. This includes my own security research.
http://llvm.org/releases/3.8.0/tools/clang/docs/ReleaseNotes.html
Note that this check is mostly effective on C structs. Inheritance confuses things. There are also options to configure that checker to change your threshold of padding. The default is pretty conservative. I'm the author of that checker.
This blog talks about missed vectorization opportunities: http://blog.llvm.org/2014/11/loop-vectorization-diagnostics-and.html
I am not aware of any analyzers / checkers that will help with aliasing / restrict usage.
What's the story about GC roots in machine registers? I've the impression from the LLVM calling conventions doc that there's no support for passing pointers in registers at all. Can we have them there though at other times?
Definitely! /u/brson has probably looked at this sort of thing the most.
https://internals.rust-lang.org/t/some-notes-on-reducing-monomorphizations/2459
and
http://llvm.org/docs/Passes.html#partial-inliner-partial-inliner
are relevant early work on this sort of thing.
First you need a place to store it; assuming that's on the stack, use an alloca:

```llvm
%o = alloca %Object, align 8
```

Now to store into the i8 in it, you get the element pointer (GEP) for it and store through that, like so:

```llvm
%gep0 = getelementptr inbounds %Object* %o, i32 0, i32 0
store i8 97, i8* %gep0, align 1
```

To store into the double, that would be:

```llvm
%gep4 = getelementptr inbounds %Object* %o, i32 0, i32 4
store double 1.0, double* %gep4, align 8
```

Loading from the i8 looks like this:

```llvm
%v = load i8, i8* %gep0
```

You can read about GEPs here: http://llvm.org/docs/LangRef.html#getelementptr-instruction
> Is it currently feasible to write a modern precise copying GC on top of LLVM?
It is now, thanks to some awesome work by Azul: http://llvm.org/docs/Statepoints.html
(BTW, I'd love to use this in Servo for JS garbage collection: it would potentially be a major advance in performance of the DOM.)
> It's also possible that we'll see an assembly back-end for Nim soon. I'm pretty sure I read about someone working on one in the Nim IRC or forum. That doesn't invalidate your point about the negative aspects of compiling to C, of course.
Would the assembly backend do IR-level optimizations?
Start with the Dragon Book.
When it actually comes time to implement the language, I would recommend just writing the frontend and reusing the backend from another compiler. LLVM is a good option: it's becoming popular as a backend, and it now has frontends for C, C++, Objective-C, Java, D, Pure, Hydra, Scheme, Rust, etc. See here for a case study on how to write a compiler using LLVM as the backend.
No, it's not a hack. There is no such thing as a "TypeName" token, nor should there be: in nearly any language (unless the language prefixed all types with a @ or some such), the tokenizer would need to keep a huge amount of context about surrounding tokens, and know more about the structure of the language than any lexer has the right to care about (that belongs in the parser).
With Clang for instance, notice in the article, this fragment from the lexer stream:
```
int 'int'            [LeadingSpace] Loc=<z.c:1:9>
identifier 'mytype'  [LeadingSpace] Loc=<z.c:1:13>
```
Notice how 'int' is of the 'int' token type, and not some sort of built-in data type, that's due to the fact that in C++, int is a keyword, not due to it being in the structure of a type.
In fact, the only reason that they are tokenized like that, and not like a standard identifier is because in C++, keywords have special behavior.
From the C++ keywords reference: "This is a list of reserved keywords in C++. Since they are used by the language, these keywords are not available for re-definition or overloading." Hence, in C++, an unexpected keyword where an identifier is expected is an error. (I wish the actual specification document were available, since it specifically defines this as an error IIRC)
Additionally, you'll see in the Clang token definition file that there are almost no tokens for structural aspects of the language, and that holds true for the tokenizers of most modern compilers. The tokenizer is simply not the place for classifying structural aspects of the language, such as types, variables, functions, or anything of the sort.
The GCC and LLVM that MacPorts needs will always be free. Here's GCC. And LLVM.
It is just the IDE that costs.
Not sure why you think the GPL protects you from patent issues. Hint: the companies wanting to pursue litigation against open source software aren't going to be GPL3 distributors.
> Well, then don't create derivative work
There is cget, which can install directly from source tarballs (i.e. `cget install http://zlib.net/zlib-1.2.11.tar.gz`) or from GitHub (i.e. `cget install jgm/cmark`). It can also install binary tarballs with `cget install -X binary http://llvm.org/releases/3.9.0/clang+llvm-3.9.0-x86_64-linux-gnu-ubuntu-16.04.tar.xz`.

It's cross-platform and builds a local environment to install packages into. There are also recipes for libraries that don't provide a requirements.txt file yet.
No, if you'd followed the compiler link you would have seen the words "open source". Google open sourced their CUDA compiler a while after the initial TensorFlow release, and it's now a part of clang. My point was that it wouldn't make sense to go and write a CUDA compiler if they were just calling CuDNN (which is distributed as pre-compiled binaries) for everything.
I'm not sure how the behaviour of having an sNaN value is relevant to this particular piece of UB, given the UB applies to perfectly normal values too.
In any case, the LLVM constraints for undefined behaviour with float casts are fairly unambiguous and, significantly, not platform dependent:
> The ‘fptosi‘ instruction converts its floating point operand into the nearest (rounding towards zero) signed integer value. If the value cannot fit in ty2, the results are undefined.
This says that, for instance, 127.99999999999999_f64 as i8 and -2147483648.9999995_f64 as i32 are fine, but 128_f64 as i8 and -2147483649 as i32 are not. Some platforms might handle the failing cases "sensibly" (e.g. reduce the infinitely precise result modulo the integer's width), but that's not entirely relevant to what the language regards as UB and/or traps on.
The only two questions I can see are that it doesn't explicitly require 1e308 as i32 and INFINITY as i32 to behave differently (and, even if the latter were fine for LLVM, Rust can impose stricter semantics); nor ...

They removed the C backend some time ago. There was a person on a mailing list who wanted to revive it, but I think it went nowhere.
Update: C backend was removed in LLVM 3.1:
> The C backend has been removed. It had numerous problems, to the point of not being able to compile any nontrivial program.
And the revival effort is probably this: https://github.com/draperlaboratory/llvm-cbe Last commit June 2015
Update 2: There is a newer fork of that backend that seems to be somewhat supported. Declared to be compatible with LLVM 3.7.
One of the main reasons that it's so horrendous is that there are very few good tutorials around for LLVM which are also up-to-date. Their official tutorial mostly does things without explaining them, or heavily overexplains things that matter less while glossing over the actual concepts that matter without even mentioning the fact that you're generating IR for later processing.
Even their official optimizer is built on the llvm::legacy::* interfaces, so I couldn't find a way to get all the standard optimizations for the modern PassManager without reading every single class to see which is an optimizing pass.
It's pretty simple when you figure it out, I just wish there was one document out there that actually explained how to use the API or what most things do.
You build IR through the API using a context. You can use a Builder object to help manage this more easily in most cases. You then pass the IR through optimizers and other passes (including one to spit out IR, or to generate and spit out assembly, or to generate an object file) and run it all with a run function. It's really that easy. That statement alone would have saved me a lot of fiddling if it had simply been in any of the tutorials (for instance, I didn't know whether the Builder was necessary, whether it would break while interworking with non-builder module building, whether LLVM could build a native object, or how optimizations are done). They even explain Passes in terms of how to define your own passes and use them, but never touch on how to use a single modern built-in one outside of the legacy interface.
I like LLVM, but I can't find a simple and easy way to learn the API on the internet. There might be a good book somewhere, but if there is, it's probably already out of date.
Are you using OS X for Swift development? If so, you could give it a shot using the Time Profiler instrument in Xcode.
> I know this probably makes only a minor difference but I would feel better knowing it in detail :)
Then this little bit of advice probably isn't what you're looking for, but my thought is that it probably doesn't matter – at least until you notice a performance impact.
In JS, certain things can make an impact because it can be difficult to squeeze performance out of a single-threaded in-browser runtime. In a lower-level language like Swift, there are usually more abstract optimizations (at least in application code).
If you wanted to get into the real nitty-gritty (for fun and learning), you could try compiling a simple Swift file with certain flags and comparing the compiler's output (from highest to lowest level):

-emit-sil to see the difference in the Swift Intermediate Language
-emit-ir to see the difference in the LLVM IR
-emit-assembly to see the difference in the actual assembly

Also, I'd bet that you'll see big differences depending on the type of the constant/variable. Is it a struct that never escapes the local scope and can (though is not guaranteed to) be allocated on the stack? Or is it a class instance that needs to be allocated on the heap and reference-counted?
At the end of the day, there are many variables (no-pun-intended), which is why I'd say don't worry too much about it until you have a performance problem. But I'd love to hear what you find from an academic perspective!
Last time I tried to do this tutorial on Windows I couldn't compile the LLVM libs in visual studio, but even just the parts about the lexer and parser are pure gold if you haven't done anything like this before.
Keep in mind though, this tutorial is written in a very "get things done" fashion. In a serious project you would have to write a little more boilerplate code.
To make the "contiguous in memory" thing a little bit more explicit for people who may not know. SIMD means that e.g. an integer multiplication is done on 4 values at a time. However, these 4 values must be adjacent in memory. So, a struct with two integer fields will not be converted into SIMD when you're only adding the first field of different elements of an array. There are SIMD instructions that support this (Scatter / Gather) but by default LLVM does not use these. Easiest by far is to stick to simple arrays. The Auto-Vectorization page from the LLVM documentation contains loads of interesting information on SIMD.
From the reference:
>The following is a list of behavior which is forbidden in all Rust code, including within unsafe blocks and unsafe functions. Type checking provides the guarantee that these issues are never caused by safe code.
>
> ...
>
> * Breaking the pointer aliasing rules with raw pointers (a subset of the rules used by C)
> * &mut and & follow LLVM’s scoped noalias model, except if the &T contains an UnsafeCell<U>. Unsafe code must not violate these aliasing guarantees.
> * Mutating non-mutable data (that is, data reached through a shared reference or data owned by a let binding), unless that data is contained within an UnsafeCell<U>.
It looks like you are trying to go from & to a mutable pointer without using UnsafeCell<U> (or some other wrapper that uses UnsafeCell<U>), which seems to me to fall under the undefined behavior described above.
When Rust talks about memcpy on copy/move, it's talking about the LLVM intrinsic, not the libc function. LLVM's optimization passes can take care of it no problem.
If you want to code in C and don't want to deal with Microsoft's compiler, but still use the IDE, the easiest way is to download and run the Windows installer from http://llvm.org/builds/
Then you can select the LLVM toolchain in the project options. The LLVM toolchain works pretty well; exceptions are still a work in progress, but given that you do not need them in C, you should be fine.
You will then be able to use the IDE for coding, debugging, and building just as you would normally, but the compiler will be clang.
When using #include, the compiler has to parse and compile the header every time. With modules, the header is "pre-compiled": the AST is generated and saved somewhere, so it can be re-used every time the compiler encounters it, saving a lot of compilation time. In addition, it allows for more granularity: you don't have to import a whole header if you only need a few functions/classes from it. See this presentation.
>Clang errors on builtin enum increments and decrements.
>
> enum A { A1, A2 };
> void test() {
> A a;
> a++;
> }
>
> returns error: must use ‘enum’ tag to refer to type ‘A’
This has to be the shittiest diagnostic that Clang can produce.
> -Wuninitialized now performs checking across field initializers to detect when one field is used uninitialized in another field initialization.
>
> class A {
> int x;
> int y;
> A() : x(y) {}
> };
>
> returns warning: field ‘y’ is uninitialized when used here [-Wuninitialized]
Good stuff! Now I know better but two years ago this one got me.
The overflow builtins are pretty cool too, since a lot of people seem to wonder how to check for overflows.
> clang-cl provides a new driver mode that is designed for compatibility with Visual Studio’s compiler, cl.exe. This driver mode makes Clang accept the same kind of command-line options as cl.exe. The installer will attempt to expose clang-cl in any Visual Studio installations on the system as a Platform Toolset, e.g. “LLVM-vs2012”. clang-cl targets the Microsoft ABI by default. Please note that this driver mode and compatibility with the MS ABI is highly experimental.
Awesome.
Did you know that exceptions in C++ are faster than the equivalent if code... in case there are no exceptions?
This is called the Zero-Cost Exception Handling.
Not only does it save instructions, it also cleans up the code (and thus the L1 instruction cache!), since exception handling code is stored out of band in totally different memory pages.
Of course, one might argue it penalizes the exceptional path (by a factor of 10x or 20x actually). But... who cares about the cost of the exceptional path!!
FYI, the if approach to handling exceptions was used long ago. But all compilers (apart perhaps from VC++, ah....) migrated away from it in favor of Zero-Cost handling because... it's faster where it matters.
Here is another approach to kickstart your compiler: use Gold Parser Builder to build your grammar file, then use one of the available Gold parsing engines to transform the code into an AST. Then use LLVM to run optimizations, do constant folding, emit the object code, etc.
Some useful links:
> Second, they don't own the copyright anyway and it's not "their" project - its a project of the University of Illinois and they have the copyright.
Actually this is incorrect, and Chris Lattner would probably scoff if he ever heard that :) It was accidentally written on some page of the website, but when someone remarked on it he immediately changed the wording.
From the Developer Policy:
> The LLVM project does not require copyright assignments, which means that the copyright for the code in the project is held by its respective contributors who have each agreed to release their contributed code under the terms of the LLVM License.
> On the other hand, LLVM specifies that the fcmp instruction always returns an ‘i1′ value (a one bit integer). The problem with this is that Kazoo wants the value to be a 0.0 or 1.0 value. In order to get these semantics, we combine the fcmp instruction with a uitofp instruction. This instruction converts its input integer into a floating point value by treating the input as an unsigned value. In contrast, if we used the sitofp instruction, the Kazoo ‘<' operator would return 0.0 and -1.0, depending on the input value.
This tutorial sounds suspiciously like the Kaleidoscope tutorial that the LLVM project has, only in Ruby. Still, excellent writeup. :)
LLVM does tricks like this all over the place with the low bits -- they have a class called PointerIntPair for cramming small integers into the zero bits of an aligned address, which is useful for cacheline density.
Chandler Carruth explains in this CppCon talk.
They've even generalized this overloaded storage concept by applying it to anything they call a Pointer-Like Type in which trailing zeros are expected.
LLVM has piles of documentation on the subject of their subset of C++. The root of that is here. It won't teach you C++ though. I recommend reading that document and then going to find some common components like the frontend, the parser, or SIL, etc. and seeing how it's applied.
I learned C++ from reading Bjarne’s book and the standard which I recognize is not going to work for everybody. If you’re more of a visual/auditory learner, then talks at BoostCon/CppCon, C++ Now, and on MSDN are a very valuable open resource. If you’re more of an experiential learner, I recommend struggling with a [Starter Bug](goo.gl/AnmGTo) and picking up a feel for C++ as you go along.
Had the same experience rewriting some socket code. Especially combined with a class like LLVM's ErrorOr or Expected error handling in C++ just became a whole lot nicer.
http://llvm.org/doxygen/classllvm_1_1ErrorOr.html https://weliveindetail.github.io/blog/post/2017/09/06/llvm-expected-basics.html
> Unfortunately, getting clang to compile MSVC based projects isn't as easy as just dropping in clang and changing a few flags. Let's get started.
Why not? I don't know why you need extra scripts for compilation. You can just install LLVM (from here: http://llvm.org/builds/) and switch to the LLVM toolset in your Visual Studio C++ project properties. I use it with VS2015 and everything works well.
I would like to see the clang tools working on Windows (especially Address Sanitizer). It would be amazing. There is also still a lack of library support for the clang compiler on Windows.
Why?
Here's a short tutorial on writing your own frontend/lexer using a fake language "Kaleidoscope" that looks a lot like python.
# Compute the x'th fibonacci number.
def fib(x)
  if x < 3 then
    1
  else
    fib(x-1)+fib(x-2)

# This expression will compute the 40th number.
fib(40)
Note that the above is not really "systematic", at least not in the way that I'd use it. It's automated, but for systematic you need something more. The gold standard is a formal specification of the language and a way to verify that the compiler implements such a specification. This paper gives a good example of one such effort. Other efforts involve code coverage metrics, documentation requirements, and traceability controls, such as those specified in DO-178B or ISO 26262.
This stuff is complicated and hard and expensive, so a free open-source project can be forgiven for testing in a somewhat ad-hoc way (and I'm sure there's more rigor behind the scenes that I'm not aware of). That said, there is an ongoing effort to make LLVM more usable in safety-critical systems, and an interesting talk was given at EuroLLVM last month.
All that info might have been overkill.. whoops, oh well.
Yep. It's definitely a mediocre university. That's why its students and professors are responsible for the creation of LLVM. Oh, and that touch screen that allows you to interact with your smartphone? Some of the touchscreen's pioneers were at UIUC.
It doesn't necessarily need to be packed as bits, but yes. In general it would have a set of flags, and the actual storage isn't very important.
I looked at what Clang's libcxx does, and it does pack bit flags into a single field (pulling just the relevant bits out):
typedef T2 iostate;
static const iostate eofbit = 0x2;

iostate __rdstate_;

bool ios_base::eof() const { return (__rdstate_ & eofbit) != 0; }
I wonder how this relates to Swift SIL's "basic block arguments", and for that matter LLVM's phi nodes? Is it something more specific?
Did you use exactly the same versions of MinGW and LLVM that the author recommends? While the LLVM 3.7.0 release candidate and previous versions were compiled with MinGW, the latest versions of LLVM are compiled with Visual Studio 2015, and Clang will search for the standard header files (like stdio.h) in the default location for Visual Studio.
A possible workaround:
or
When you want to use Clang, open Visual C++ 2015 x64 Native Build Tools Command Prompt or Visual C++ 2015 x86 Native Build Tools Command Prompt and use the Clang compiler as usual.
Like this. Simple examples include constant propagation and loop unswitching.
You can have all this now. There is an unofficial mesa-git repo that has been working great here.
[parker@x3720 ~]$ llvm-ar --version
LLVM (http://llvm.org/):
  LLVM version 3.9.0svn
[parker@x3720 ~]$ glxinfo | grep -i version
server glx version string: 1.4
client glx version string: 1.4
GLX version: 1.4
    Version: 12.1.0
Max core profile version: 4.3
Max compat profile version: 3.0
Max GLES1 profile version: 1.1
Max GLES[23] profile version: 3.1
OpenGL core profile version string: 4.3 (Core Profile) Mesa 12.1.0-devel (git-29f53d7)
EDIT: Forgot the link https://wiki.archlinux.org/index.php/unofficial_user_repositories#mesa-git
Author of that article here. Testing to figure out what works and what doesn't proved to be time consuming and difficult. I'm still not entirely sure, and it's all subject to change at any time. (The joys of Windows development.) I tried installing different things in different orders in a freshly setup VM before building a few test programs. On Windows 7, I had the same error when installing just the SDK, but Clang worked correctly when I installed the full build tools (which includes the SDK).
Then install Clang 3.8 after all that's installed.
> In the case of corporate open-source, I really don't think most of them would feed those contributions back if they weren't legally obligated to.
If they didn't want to give modifications back, they could just use BSD and yet Linux is the more actively developed OS.
> I mean, why give your competitor an advantage?
In the end capitalism is about money. Sharing your code with your competitors means less maintenance costs. At a LLVM/Clang conference Sony developers talked about their use of LLVM/Clang in PlayStation 4's SDK. Just read pages 26–28 of http://llvm.org/devmtg/2013-11/slides/Robinson-PS4Toolchain.pdf
The developers clearly weren't happy that they had to work in secret. Not sharing the code meant that they had to shoulder the maintenance burden themselves. In the end it cost Sony more to work in secret than sharing the code would have (and we're talking about a BSD-licensed project here).
Copyleft licenses became so popular even in the corporate space because it is not a one-way street. No competitor can just screw you over (not legally at least). If it's beneficial for you to share the code and drive costs down, you might as well hold your competitors to the same standards and release your code under a copyleft license.
Syntax extensions operate upon the compiler-internal AST, not the emitted LLVM IR. It sounds to me like what you want is to write a custom LLVM pass, which can be done without (AFAIK) patching the compiler at all: http://llvm.org/docs/WritingAnLLVMPass.html#registering-dynamically-loaded-passes
Alternatively, if your idea is the sort of thing that should be a part of the language itself, then I would encourage you to write up and submit an RFC for adding functionality to the language/compiler.
The easiest way to use this crate on Windows is to just download the llvm+clang binaries from here and drop libclang.dll somewhere rustc can find it. I added it to <rust>/lib/rustlib/x86_64-pc-windows-gnu/lib.
Edit: I'll add this information to the README.md.
I believe the vectorizer is already pretty eager, so it's more about structuring your code so that vectorization is possible in the first place.
There are two different kinds of vectorization: performing the work of several iterations at once, and combining similar calculations within a single iteration into fewer vector instructions.
It's a little bit technical and focuses on C++, but LLVM's documentation on the vectorizer helps give some insight into the kinds of cases it can optimize: http://llvm.org/docs/Vectorizers.html#features (I can't say for certain whether it can do all these optimizations on IR generated by rustc.)
Generally, the vectorizer is pretty good at optimizing loops as long as they don't abuse control flow too much or have too many side-effects. If you're just performing some calculations in a tight loop, LLVM will probably vectorize it without a second thought. If you're printing to stdout and inserting elements into a HashMap, some sections might be vectorized but most of them probably won't be, because each element can trigger entirely different behavior.
I created a sample of a few different functions which vectorize cleanly: http://is.gd/gq0axi
If you select "Release" and then hit "LLVM IR" and search for the function names, you should see under each a line that reads:
br label %vector.body
That's a clear indicator that the function was vectorized, and in fact in each %vector.body label we can see operations on what is effectively an i32x4, for example in the vectorized loop for sum:
%5 = add <4 x i32> %wide.load, %vec.phi
%6 = add <4 x i32> %wide.load13, %vec.phi11
I'm not quite sure what those operands are, but add <4 x i32> is definitely a SIMD instruction.
Nah, bitcode is really just LLVM IR, which isn't arch-independent. By the time you've compiled to IR, you've already baked in some platform-specific assumptions.
Source: http://llvm.org/docs/FAQ.html#can-i-compile-c-or-c-code-to-platform-independent-llvm-bitcode
Windows is a primary target for the LLVM project: it gets just as much attention as OS X and Linux (and *BSD).
If you were to look at the downloads page, you would notice a clang installer for Windows. Neat, huh?
That assertion failure is coming from LLVM, probably because the LLVM ptrtoint instruction is supposed to convert to an integer type. http://llvm.org/docs/LangRef.html#ptrtoint-to-instruction
Julia is apparently generating invalid LLVM code for this?
julia> f1(a) = reinterpret(Float64, pointer(a))
f1 (generic function with 1 method)

julia> f2(a) = reinterpret(Float64, reinterpret(Int64, pointer(a)))
f2 (generic function with 1 method)

julia> @code_llvm f1(a)

define double @julia_f1_21195(%jl_value_t*) {
top:
  %1 = bitcast %jl_value_t* %0 to i8**
  %2 = load i8** %1, align 8
  %3 = bitcast i8* %2 to double*
  %4 = ptrtoint double* %3 to double
  ret double %4
}

julia> @code_llvm f2(a)

define double @julia_f2_21196(%jl_value_t*) {
top:
  %1 = bitcast %jl_value_t* %0 to i8**
  %2 = load i8** %1, align 8
  %3 = ptrtoint i8* %2 to i64
  %4 = bitcast i64 %3 to double
  ret double %4
}
The classical FDO is not designed to be shipped with the source, but that is one of the design goals of AutoFDO. Also, LLVM seems to have some work done on this front (http://llvm.org/devmtg/2013-11/slides/Carruth-PGO.pdf), moving instrumentation to the language frontend to be able to track the sources. I did play with LLVM's version and it did not seem better than the strategy of throwing away profiles of functions that no longer match (and the format is also a binary blob), but perhaps it just needs more work. I guess moving instrumentation to that high a level needs to be justified, because it is a lot harder to do ;)
> probably safer than the Rust implementations of the same
They are certainly more mature, but I am not sure they are that much safer. C++ is incredibly complex, and its containers come in many flavors (std::vector, std::deque, ...). This means that C++ implementations require more code just to deal with the exponential explosion of the number of situations (trying to eke out the last ounce of performance in each and every one) whilst at the same time fighting a reluctant language. I mean, this is libc++'s vector: look at the size of the file, and at all the subroutines that insert calls (__move_range, __split_buffer, __swap_out_circular_buffer). Oh, and did you see all the debug code that tries to catch iterator invalidations?
Now, look at Vec::insert: 20 lines of code, and the only subroutine worth mentioning is reserve. Why? Because a move does not throw in Rust.
As Hoare said:
> There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies, and the other way is to make it so complicated that there are no obvious deficiencies. The first method is far more difficult.
So, yes, C++ implementations are battle-tested. Or at least the big ones are. However, I would contend that Rust may be in a position to offer similar or better guarantees right now, because simpler implementations are much easier to check.
And as we already mentioned, in Rust the collections are safe to use... because let's be honest, most crashes of C++ with backtraces originating in the Standard Library are due to unsafe usage, not to bugs in the library itself.
This link has been spammed for several days now and it's always from user accounts a few hours old.
If any Redditors are actually interested in developing a language, I highly recommend checking out LLVM's Kaleidoscope tutorial. It's obviously LLVM-specific and it's not heavy on compiler theory, but I think it's a good introduction for someone who just wants to get something working.
Announcement text quote:
Hi all!
Finally, LLVM 3.6 has been released! See the release notes here: http://llvm.org/releases/3.6.0/docs/ReleaseNotes.html Downloads: http://llvm.org/releases/download.html#3.6.0
Also note that LDC is mentioned in the release notes as one of the projects that already support LLVM 3.6. Just recompile LDC using the master branch from GitHub or from the 0.15.1 source.
This is the 6th time that LDC and D are mentioned in the LLVM release notes!
Regards, Kai
No, because having C as an intermediate language limits what your host language can do. For example, you can have a language that requires tail call elimination semantics - even for function pointers - but you cannot express that in C (you can only hope that the C compiler does it for you).
Of course, if you never aspire to do certain things C cannot, then it's alright ;-).