|
Apologies for the shouting but this is important.
When answering a question please:
- Read the question carefully
- Understand that English isn't everyone's first language so be lenient with bad spelling and grammar
- If a question is poorly phrased then either ask for clarification, ignore it, or mark it down. Insults are not welcome
- If the question is inappropriate then click the 'vote to remove message' button
Insults, slap-downs and sarcasm aren't welcome. Let's work to help developers, not make them feel stupid.
cheers,
Chris Maunder
The Code Project Co-founder
Microsoft C++ MVP
|
|
|
|
|
For those new to message boards please try to follow a few simple rules when posting your question.
- Choose the correct forum for your message. Posting a VB.NET question in the C++ forum will end in tears.
- Be specific! Don't ask "can someone send me the code to create an application that does 'X'". Pinpoint exactly what it is you need help with.
- Keep the subject line brief, but descriptive. eg "File Serialization problem"
- Keep the question as brief as possible. If you have to include code, include the smallest snippet of code you can.
- Be careful when including code that you haven't made a typo. Typing mistakes can become the focal point instead of the actual question you asked.
- Do not remove or empty a message if others have replied. Keep the thread intact and available for others to search and read. If your problem was answered then edit your message and add "[Solved]" to the subject line of the original post, and cast an approval vote to the one or several answers that really helped you.
- If you are posting source code with your question, place it inside <pre></pre> tags. We advise you also check the "Encode HTML tags when pasting" checkbox before pasting anything inside the PRE block, and make sure "Ignore HTML tags in this message" check box is unchecked.
- Be courteous and DON'T SHOUT. Everyone here helps because they enjoy helping others, not because it's their job.
- Please do not post links to your question in one forum from another, unrelated forum (such as the lounge). It will be deleted.
- Do not be abusive, offensive, inappropriate or harass anyone on the boards. Doing so will get you kicked off and banned. Play nice.
- If you have a school or university assignment, assume that your teacher or lecturer is also reading these forums.
- No advertising or soliciting.
- We reserve the right to move your posts to a more appropriate forum or to delete anything deemed inappropriate or illegal.
cheers,
Chris Maunder
The Code Project Co-founder
Microsoft C++ MVP
|
|
|
|
|
A C# program is compiled with a compiler written in c++
A C++ program is compiled with a compiler written in assembly
An assembly program is compiled with a compiler written with machine instructions
Is that roughly how it works?
How does programming with machine instructions take place?
Let’s say I have the ASM MOV command. Translated to machine instructions, that’s probably one or two numbers: one telling the processor it’s an operation rather than a register or memory address, and the other telling it exactly which type of operation it is. I’m making up stuff for how things could work.
To get the above functioning on all processors a standard should be required where the numbers/machine instructions for MOV are recognized everywhere. I mean it should work like a hardware resource with the same ID present on old and new processors.
|
|
|
|
|
Calin Negru wrote: A C# program is compiled with a compiler written in c++
A C++ program is compiled with a compiler written in assembly
An assembly program is compiled with a compiler written with machine instructions
Is that roughly how it works?
Not exactly. It isn't a hierarchy where each language is translated into a lower-level language. Let's put C# aside for the moment because it isn't exactly compiled (I'll explain in a bit). For the other languages, the compiler is a machine language program (the processor cannot execute anything else) that translates the source code directly into machine language. In general you don't care what language the compiler was written in. These days it's rare to have an assembler or compiler written in assembly language. Most, if not all, of them are written in C, C++ or other high-level languages.
C# is a bit different because the compiler translates the source program into code for a virtual machine. This is called IL (intermediate language). When the program gets executed, the parts that need to be executed are translated into machine code. This is called Just-In-Time (JIT) compiling.
Edit: I left out a lot of details and exceptions. For instance the first C++ "compiler" was actually a preprocessor that translated C++ to plain C. This is a very rough sketch.
Mircea
|
|
|
|
|
Calin Negru wrote: A C# program is compiled with a compiler written in c++
A C++ program is compiled with a compiler written in assembly
An assembly program is compiled with a compiler written with machine instructions
Is that roughly how it works?
Nope, that's not how it works. For example, the Roslyn compiler platform for .NET is written in C#. The very first C# compiler was probably written in C/C++, but not subsequent versions.
A MOV instruction is not the same size on all platforms. For example, a MOV instruction, with operands, on an 8-bit CPU is not the same size as it is on a 64-bit CPU. The op-code value is determined by the available addressing modes for the instruction, the available registers and their width, the instruction decode logic and hardware in the CPU, the data bus width, the address bus width, and a sprinkle of arbitrariness. Since CPU designs differ so vastly, the hardware makes it impossible to have the same representation across all CPUs.
Think about it. On a 64-bit CPU, how are you going to have a "MOV r, imm" (MOVe immediate to register) be represented and work exactly the same on an 8-bit CPU when it doesn't have registers that can hold a 64-bit integer?
There's far more to this than what I've posted, but this scratches the surface of why that idea will never work.
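To make the size difference concrete, here is a small sketch that stays within the x86 family. The byte sequences are the documented encodings for loading the constant 1 into the 8-bit, 32-bit and 64-bit accumulator registers; the array names and the `encoded_size` helper are made up for this illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Encodings of "move the constant 1 into the accumulator" on x86/x64.
// Opcode forms as listed in the Intel instruction set reference:
//   B0+r ib         MOV r8,  imm8
//   B8+r id         MOV r32, imm32
//   REX.W B8+r io   MOV r64, imm64
constexpr std::array<std::uint8_t, 2> mov_al_1{{0xB0, 0x01}};
constexpr std::array<std::uint8_t, 5> mov_eax_1{{0xB8, 0x01, 0x00, 0x00, 0x00}};
constexpr std::array<std::uint8_t, 10> mov_rax_1{
    {0x48, 0xB8, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00}};

// Number of bytes the encoded instruction occupies in memory.
template <std::size_t N>
constexpr std::size_t encoded_size(const std::array<std::uint8_t, N>& code) {
    return code.size();
}
```

The same mnemonic takes 2, 5 or 10 bytes depending on the register width, and the 64-bit form needs a prefix byte that an older 32-bit x86 CPU would decode as a different instruction.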
|
|
|
|
|
A compiler for "any" programming language can be written in "any" language.
Quite a few compilers throughout history have been written in themselves. Usually, you cannot start out with that (I'll come back to that below): You must write the very first compiler in some other language. Often, that first compiler handles only a small subset of the new language. When developing Pascal, Wirth tried to write this very first compiler in Fortran, but gave up: While you can write a compiler in Fortran, it certainly isn't a language well suited for the task. So Wirth changed to assembly for the very first small-subset-Pascal compiler.
(I know of an operating system that was written in Fortran, but most people refuse to believe that!)
Once you've got that subset-Pascal (or whatever language we are talking about) up and running, you program the next compiler version in that subset-Pascal, but now you make a more advanced compiler, maybe for the entire, un-subsetted language. Now you have a full compiler written in itself.
Most likely, that subset-Pascal was so limited that you had to program in less elegant ways to get around the limitations. So maybe you program a third Pascal compiler, but since you have now got a full-featured compiler at your hand, you can program version 3 using all the great new features of your new language.
This process of going from a first (here: programmed in assembly) compiler to the second (programmed in subset-Pascal) to a third (programmed in full-featured Pascal) is referred to as 'bootstrapping'.
I know of one case where a full-featured compiler was written in itself, using all the features of the language, and there never was another compiler involved: The language was even more primitive than K&R C, called 'NPL'. Its developer wrote the NPL compiler in NPL, so he knew very well what an NPL line would compile to. So he started at the top of the NPL compiler source code, and typed into a new file the machine instructions that the compiler should generate for the first line. And for the second line. And for the third ... Down to the last line of the NPL compiler source code. When he ran the compiler source through that program he had just been typing in, he got a new file with the same contents as what he had typed in, instruction by instruction. So the compiler worked as expected!
(That guy was slightly crazy: I was once complaining to him about a bug in the OS, which was written in NPL. He dug up the OS source code - this was in the days of hardcopy printout - and found the function I was complaining about. After some grunting and huffing, he spotted the error, and dug out a ball point pen to write a correction into the printout. Did he write the corrected statements in NPL, the language of the printout? No. Did he write it in symbolic assembly code? No. Did he write it as the octal representation of the binary instruction codes? Yes, with offsets and all as octal values!)
|
|
|
|
|
You might want to double-check who you're replying to.
|
|
|
|
|
trønderen wrote: I know of one case where a full-featured compiler was written in itself, using all the features of the language, and there never was another compiler involved: The language was even more primitive than K&R C, called 'NPL'.
Forth is very close to that, and was designed specifically to work like that. The most primitive basics are written (or were written) in assembler. Then some other items are added, written in those primitives. Then more are written, based on the prior sets. And so it goes.
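As a rough illustration of that layering (not how any real Forth is implemented), here is a toy stack machine. The class name `TinyForth` and its members are invented for this sketch: two primitives are hard-coded, standing in for the assembler layer, and every other word is just a list of previously known words.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Toy model of Forth-style layering. Everything here is hypothetical.
class TinyForth {
public:
    // Define a new word as a sequence of already known words.
    void define(const std::string& name, std::vector<std::string> body) {
        words_[name] = std::move(body);
    }

    // Run one word: a primitive, a defined word, or a number literal.
    void run(const std::string& word) {
        if (run_primitive(word)) return;
        const auto found = words_.find(word);
        if (found != words_.end()) {
            for (const std::string& inner : found->second) run(inner);
            return;
        }
        push(std::stol(word));  // throws std::invalid_argument for unknown words
    }

    void push(long value) { stack_.push_back(value); }

    long pop() {
        if (stack_.empty()) throw std::runtime_error("stack underflow");
        const long value = stack_.back();
        stack_.pop_back();
        return value;
    }

private:
    // The bottom layer; a real Forth would write these in assembler.
    bool run_primitive(const std::string& word) {
        if (word == "dup") { const long a = pop(); push(a); push(a); return true; }
        if (word == "*") { const long b = pop(); const long a = pop(); push(a * b); return true; }
        return false;
    }

    std::vector<long> stack_;
    std::map<std::string, std::vector<std::string>> words_;
};
```

Defining `square` as `dup *` and then `cube` as `dup square *` gives a word built on a word built on primitives, which is the "and so it goes" part.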
|
|
|
|
|
> A MOV instruction is not the same size on all platforms
That’s why you tell the compiler which architecture you’re targeting, I think I get it.
|
|
|
|
|
Usually, a machine instruction is a word of, say, 32 bits. The first 6 to 8 bits (typically) tell what this instruction does: Move data, jump to somewhere, add two values etc. The next few bits may indicate how you go about finding the operand, i.e. what to move, where to jump or which value to add. The interpretation of the following bits depends on those (often called the 'address mode' bits): Either as a register number, how far to jump, or the value to add. If the address mode bits say so, maybe the value to add is not in the instruction itself, but the instruction tells in which register you can find the address of the value to add.
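The field layout described above can be sketched with a few shifts and masks. The format below is entirely made up for illustration (8 opcode bits, 4 address mode bits, 20 operand bits) and does not match any real CPU; `encode`, `decode` and `Decoded` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 32-bit instruction word, not any real CPU's format:
//   bits 31..24  opcode        (what to do: move, jump, add, ...)
//   bits 23..20  address mode  (how to interpret the operand bits)
//   bits 19..0   operand       (register number, jump distance, or value)
struct Decoded {
    std::uint8_t opcode;
    std::uint8_t mode;
    std::uint32_t operand;
};

// Pack the three fields into one instruction word.
constexpr std::uint32_t encode(std::uint8_t opcode, std::uint8_t mode,
                               std::uint32_t operand) {
    return (static_cast<std::uint32_t>(opcode) << 24) |
           ((static_cast<std::uint32_t>(mode) & 0xFu) << 20) |
           (operand & 0xFFFFFu);
}

// Split an instruction word back into its fields, as a decoder would.
constexpr Decoded decode(std::uint32_t word) {
    return Decoded{static_cast<std::uint8_t>(word >> 24),
                   static_cast<std::uint8_t>((word >> 20) & 0xFu),
                   word & 0xFFFFFu};
}
```

With this layout, `encode(0x12, 1, 5)` yields the word `0x12100005`, and `decode` recovers the opcode, mode and operand from it.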
The compiler breaks down your code into more primitive operations, such as "add the value of X", without being concerned about what the proper machine instruction will look like. Not until it gets down to the very bottom, the 'code generator'. A compiler may provide different code generators for different machine types. A given code generator knows what an 'add' instruction looks like on x64 processors, knows the proper address mode bits and how to put the address of X into the instruction. Another code generator, for ARM, knows another code for ADD on the ARM, but it also knows that you cannot directly add something in memory. First the code generator must look up an unused register (and if there is none, it must generate an instruction to flush the value in one register back to memory to free it up), then generate an instruction to load X into the free register, and then generate an instruction to do the actual add of the newly loaded value.
There is no common, standardized code for move, jump or add across different machines. The code generator knows the codes for this machine. You may tell the compiler to switch to another code generator knowing the codes for another machine (that is commonly referred to as 'cross compiling'), but the program that comes out of it cannot be run on this machine; you must copy it to another machine of the right type to run it.
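A toy version of "one primitive operation, several code generators" might look like the sketch below. The `Target` enum, the `generate_add` function and the output mnemonics are all invented; the strings are simplified pseudo-assembly, not valid x64 or ARM syntax, and register allocation is reduced to fixed register names.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Two hypothetical kinds of target machine.
enum class Target {
    MemoryOperand,  // an add may take its operand straight from memory
    LoadStore       // arithmetic works on registers only
};

// Generate pseudo-assembly for the primitive operation "add the value of X"
// to an accumulator that is assumed to live in register r0.
std::vector<std::string> generate_add(Target target, const std::string& variable) {
    switch (target) {
    case Target::MemoryOperand:
        return {"add r0, [" + variable + "]"};
    case Target::LoadStore:
        // r1 is assumed to be free; a real code generator would have to
        // find a free register or spill one to memory first.
        return {"load r1, [" + variable + "]", "add r0, r0, r1"};
    }
    return {};
}
```

The front end of the compiler only asks for "add X"; which of the two instruction sequences comes out depends solely on the code generator selected.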
In the good old days, there were dozens of different machine types out there, each with its own instruction codes. The last 35 years or so, the 'x86' has pushed most others out. Every PC in the world can understand x86 codes.
But x86 is for 32-bit machines! 64-bit PCs understand 'x64' codes (and address modes), which are different! If you want that move, jump or add to work on a 64-bit PC, your compiler must select a code generator that knows the proper codes for x64. The program will not work on an old x86 PC.
Relax ... The x64 CPUs can be told to forget the x64 codes and run the x86 ones instead. The .exe file tells which instruction set should be activated, so you can run 35-year-old programs on your brand new 64-bit PC (provided that your current OS will honor all the requests made by that old program, which is not guaranteed - but 20-25 years is probably on the safe side).
Then, what about your smartphone - will it know x64 move, jump or add instruction codes? No. Will it know x86 instruction codes? No. You have to ask your compiler to use the code generator that makes ARM instruction codes. ARM has 32-bit and 64-bit codes too, and they are different ... Besides: If your program was written for Windows, it expects that it can ask the OS (i.e. Windows) for this and that service - and Android says Huh?? The services provided by Android are quite different.
Bottom line: The rosy, cosy days when x86 worked everywhere are over.
Then comes dotNet. When a dotNet 'assembly' (informally you may call it a module) is loaded into your machine, smartphone or whatever, you'll see an incompletely compiled program: The compiler has left a message: Here I should have generated the code for adding the value of X, but I didn't know what kind of machine this will be running on! So please, before running this program, generate the proper instructions for adding X, will you?
dotNet for a given machine has the proper code generator for that machine. dotNet on an ARM32 generates ARM32 codes, dotNet on an ARM64 generates ARM64 codes, dotNet on a 64 bit PC generates x64 codes. They are all different.
For now, Windows itself is not dotNet, so it must be compiled separately for every machine architecture. A growing number of applications are dotNet, incompletely compiled, and the last step of compilation, code generation, is not done until you know for sure which codes to use, just in time for execution. So the dotNet code generator is frequently referred to as the "just in time compiler", or "jitter".
ps.
If you want to look at instruction codes and addressing modes and such to see what they are like, my recommendation is to stay away from x64 and x86. They are both a mess, having grown and been extended and grown more and been extended ... into a crow's nest. ARM64 (aka AArch64) is certainly not the simple, easily understood thing that the early 32-bit ARMs were, yet it has retained a much more manageable structure.
|
|
|
|
|
tronderen and Dave, thanks for your extensive explanations. At this point I don’t understand all the bits but I think I’m closing in
|
|
|
|
|
All you really need to understand is:
- A compiler can be written in anything (more or less), from machine code through assembler up to most high-level languages.
- The output of the compiler must be code that is compatible with the machine that will run the final executable.
- The term "machine" can mean the actual hardware, a virtual machine (like the Java Virtual Machine), or a framework such as .NET.
- The actual hardware instructions do not have to be the same across all platforms, although it would be nice. Just as USB connectors keep changing, hardware platforms keep evolving.
|
|
|
|
|
You can write your own compiler.
The simplest way to do it, without doing much studying, is to write a 'calculator' which takes tokens like numbers, the plus sign and the equals sign.
After you do that then read up a bit more on compilers and apply some of what you read to what you previously wrote.
Then add variables.
If you really want to keep going after that, add 'if-then-else', because that structure has problems that compiler theory talks about quite a bit.
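A first step along those lines could look like the sketch below: a tokenizer plus an evaluator for inputs such as `1 + 2 + 39 =`. The names `Token`, `tokenize` and `evaluate` are my own, and the grammar (numbers joined by plus signs, ended by an equals sign) is just the minimal one described above.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

enum class TokenKind { Number, Plus, Equals };

struct Token {
    TokenKind kind;
    long value;  // only meaningful for Number tokens
};

// Lexer: turn the input text into a list of tokens.
std::vector<Token> tokenize(const std::string& text) {
    std::vector<Token> tokens;
    std::size_t i = 0;
    while (i < text.size()) {
        const unsigned char c = static_cast<unsigned char>(text[i]);
        if (std::isspace(c)) {
            ++i;
        } else if (std::isdigit(c)) {
            long value = 0;
            while (i < text.size() &&
                   std::isdigit(static_cast<unsigned char>(text[i]))) {
                value = value * 10 + (text[i] - '0');
                ++i;
            }
            tokens.push_back({TokenKind::Number, value});
        } else if (c == '+') {
            tokens.push_back({TokenKind::Plus, 0});
            ++i;
        } else if (c == '=') {
            tokens.push_back({TokenKind::Equals, 0});
            ++i;
        } else {
            throw std::invalid_argument("unexpected character");
        }
    }
    return tokens;
}

// Parser and evaluator for the grammar: number ('+' number)* '='
long evaluate(const std::vector<Token>& tokens) {
    std::size_t pos = 0;
    auto expect_number = [&]() -> long {
        if (pos >= tokens.size() || tokens[pos].kind != TokenKind::Number)
            throw std::invalid_argument("expected a number");
        return tokens[pos++].value;
    };
    long total = expect_number();
    while (pos < tokens.size() && tokens[pos].kind == TokenKind::Plus) {
        ++pos;
        total += expect_number();
    }
    // The '=' must be the one remaining token.
    if (pos + 1 != tokens.size() || tokens[pos].kind != TokenKind::Equals)
        throw std::invalid_argument("expected '=' at the end");
    return total;
}
```

Keeping the lexer and the evaluator as separate functions makes it easier to bolt on variables later, since only `evaluate` needs to learn about names.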
|
|
|
|
|
.
|
|
|
|
|
Where did you find this expression?
|
|
|
|
|
Did you read the documentation?
"In testa che avete, Signor di Ceprano?"
-- Rigoletto
|
|
|
|
|
Maybe if you actually provided the full details ...
|
|
|
|
|
[feb 2,2023,3:01am] same as Victor, could you describe the context of that message? A veteran probably needs no further clue to guess the source of where that came from but some of us are not veterans (not me at least)
|
|
|
|
|
Because you're too lazy to do the work yourself, you post this nonsense just so you can down vote answers and legitimate questions? You are a troll.
Quote: From the perspective of an application, a "cancellation point" is any one of a number of POSIX function calls such as open(), and read(), and many others.
"Before entering on an understanding, I have meditated for a long time, and have foreseen what might happen. It is not genius which reveals to me suddenly, secretly, what I have to say or to do in a circumstance unexpected by other people; it is reflection, it is meditation." - Napoleon I
|
|
|
|
|
Gerry Schmitz wrote: You are a troll.
I tend to agree; he certainly has history. He also complains of being thrown out of other forums for "asking too many questions", but I suspect that is not the real reason.
|
|
|
|
|
|
Hi
I just had some discussions on IBMMAIN regarding the C++ code (a DLL) that I developed and tried porting to z/OS,
since I need a lot of the same functionality there.
I have been compiling with z/OS XL C++ and I got some differences.
As an example:
auto ret = procpointer->extsymcollector->insert({ *s, *exsympointer });
where the XL C++ compiler didn't like the '{'.
I was told by someone who works on the XL C++ compiler to ditch MSVC
and go with Clang/LLVM
by going here:
Clang/LLVM support in Visual Studio projects | Microsoft Learn
As MSVC only goes to C++ 11
In addition, I was told to ditch XL C++ and go to Open XL C++, as that goes to C++ 17 or 18 and is based on Clang/LLVM.
|
|
|
|
|
|
ForNow wrote: As MSVC only goes to C++ 11
No, it fully supports C++20.
Go to project properties pages -> General -> C++ Language Standard and select "ISO C++ 20 Standard".
Now, for that particular piece of code, the "{}" is the C++ initialization syntax available since C++11. You can try replacing that with:
auto ret = procpointer->extsymcollector->insert(T(*s, *exsympointer));
where T is the type of object that is inserted.
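As a self-contained sketch of the two spellings: the original post does not say what type `extsymcollector` points to, so the `std::map<std::string, int>` below is only a stand-in, and the function names are made up. For a map, the "T" in question is the container's `value_type`.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for the container behind extsymcollector.
using SymbolTable = std::map<std::string, int>;

// C++11 style: the braces build the key/value pair implicitly.
bool insert_with_braces(SymbolTable& table, const std::string& name, int address) {
    auto ret = table.insert({name, address});
    return ret.second;  // true if a new element was inserted
}

// Pre-C++11 friendly style: name the element type explicitly.
bool insert_with_explicit_type(SymbolTable& table, const std::string& name, int address) {
    auto ret = table.insert(SymbolTable::value_type(name, address));
    return ret.second;
}
```

Both functions do the same thing; the second form only avoids the braced initializer that the older compiler rejected. Note that `auto` is itself a C++11 feature, so a compiler that rejects the braces may need the return type spelled out as well.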
Mircea
|
|
|
|
|
David Crayford, who works on the XL C/C++ z/OS compiler, suggested I switch my compiler from MSVC to Clang/LLVM for a few reasons, one of which seems to be easier portability.
What’s your opinion?
Thanks
|
|
|
|
|