Advanced info on CPU Emulators

06-20-2011

Quote:

Originally Posted by theKbStockpiler

Okay, I write code that mimics a CPU in that it has memory assigned to registers and flags and such to document the entire state of the CPU. I can't find a way around that I believe that for every target CPU instruction my emulator basically looks up a routine in a table and this routine changes the state of my phony CPU that exists in memory.

A table, or a giant switch() statement, or what have you. It just has to make a decision between a lot of instructions at once.
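To make the table-versus-switch choice concrete, here is a minimal sketch of the table approach in C. Everything in it is invented for illustration (the `struct cpu` layout, the opcode numbers, the handler names, the `step` function) and is not taken from any real emulator.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CPU state: two registers, an instruction pointer, a run flag. */
struct cpu {
    unsigned char a, b, ip;
    int running;
};

/* Every handler receives the CPU state and the memory being executed. */
typedef void (*op_handler)(struct cpu *c, const unsigned char *mem);

static void op_halt(struct cpu *c, const unsigned char *mem) { (void)mem; c->running = 0; }
static void op_loda(struct cpu *c, const unsigned char *mem) { c->a = mem[c->ip++]; }
static void op_lodb(struct cpu *c, const unsigned char *mem) { c->b = mem[c->ip++]; }
static void op_add(struct cpu *c, const unsigned char *mem)  { (void)mem; c->b += c->a; }

/* One slot per possible opcode byte; slots not listed stay NULL (invalid). */
static const op_handler dispatch[256] = {
    [0x00] = op_halt,
    [0x01] = op_loda,
    [0x02] = op_lodb,
    [0x03] = op_add,
};

/* Fetch one opcode and run its handler.
   Returns 0 on success, -1 on an invalid opcode. */
static int step(struct cpu *c, const unsigned char *mem)
{
    assert(c != NULL && mem != NULL);
    op_handler h = dispatch[mem[c->ip++]];
    if (h == NULL)
        return -1;
    h(c, mem);
    return 0;
}
```

Calling `step()` in a loop while `running` is nonzero does the same job as a big `switch()`. The table trades one indirect call per instruction for code that is easier to split across files.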

Quote:

Now I would have the phony CPU run the data and this would just be saved in memory. What would the process or application be called that puts this data in a certain form so the O.S would then run it and how would it do it?

[edit] I think I misunderstood your question.

I'm not sure I understand it, for that matter. Put the data in a format so the O.S can run it? Run data?

Are you asking how you would get useful data in and out of the emulator? That's up to you, I suppose. If you're expecting it to run programs meant for a different kind of computer, you'd probably need to emulate more hardware too, like DOSbox does.

DOSbox for example does what it says on the tin. It emulates a DOS machine well enough to run a good number of old games and applications. (On any system -- it doesn't depend on a real x86 cpu.) The list of hardware it emulates ends up being a bit daunting, including but probably not limited to:

VGA controller: Displays the contents of a special hardware memory area on screen. In an emulator, that memory is ordinary RAM like everything else, and routines independent of the CPU emulator are responsible for transferring its contents into an ordinary graphical window where you can see them.
Keyboard controller: DOS programs expect to communicate with the raw, old-fashioned keyboard controller, so DOSbox pretends to have one.
PC speaker: To do proper PC speaker beeps, DOSbox pretends to have the old-fashioned IBM PC timer chip. This chip is necessary for lots of other things anyway, like the interrupts DOS and many games use to mark time.
Soundblaster: It pretends to have one of these too. Or a Gravis Ultrasound.
Adlib card: Mostly an OPL2 hardware synthesizer chip. An earlier sound card that the Soundblaster cloned and expanded with raw digital sound.
BIOS routines: In a real computer they're instructions stored in a special read-only area of memory. But since programs don't usually care what's in them as long as they work, DOSbox cheats a little, just triggering higher-level routines outside the emulated environment instead of doing all the work inside the emulated CPU. This includes things like basic video mode changing.
Operating system routines: DOSbox gives you an actual DOS-like prompt and most of the DOS routines a real DOS environment would give to programs. This includes things like opening, reading, writing, deleting, and closing files. DOSbox cheats here too. When the emulated program triggers a software interrupt that would make a real CPU jump to a location in memory, the emulator catches it and makes a real, native system call which ends up doing what the program asked. Not keeping all that DOS code in emulated memory also has the happy side effect of a DOS machine which can boot with nearly 640K free.
Comm ports and modems: In the days before widespread internet access, most home telecommunication meant modems and COM ports. DOSbox can emulate having a dialup modem convincingly enough for some BBS software to work, but the pretend-modem is actually a TCP socket: when people connect to the socket, the DOS program is told the phone is ringing, and people can connect and talk with it with a terminal. Which is why we still have Tradewars 2001 in 2011.
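The cheat described for the BIOS and operating system routines, catching a software interrupt and servicing it on the host, can be sketched roughly like this. It is not DOSbox's actual code: the `struct cpu` fields and the `handle_interrupt` function are made up for illustration. The only real-world detail borrowed is the DOS convention that INT 21h prints the character in DL when AH=02h and terminates the program when AH=4Ch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical slice of emulated CPU state. */
struct cpu {
    unsigned char ah, al, dl;   /* the 8-bit registers this sketch needs */
    int running;
    int exit_code;
};

/*
 * Called by the (imaginary) CPU core when it decodes INT n. Instead of
 * jumping to a handler stored in emulated memory, do the work natively.
 * Returns 1 if the interrupt was serviced on the host, 0 if it was not.
 */
static int handle_interrupt(struct cpu *c, unsigned char n, FILE *out)
{
    assert(c != NULL && out != NULL);
    if (n != 0x21)
        return 0;               /* not a DOS service call */

    switch (c->ah) {
    case 0x02:                  /* DOS: write the character in DL */
        fputc(c->dl, out);
        return 1;
    case 0x4c:                  /* DOS: terminate, exit code in AL */
        c->exit_code = c->al;
        c->running = 0;
        return 1;
    default:
        return 0;               /* a service this sketch does not provide */
    }
}
```

When the function returns 0, a fuller emulator would have to fall back on something else, such as running a handler inside the emulated machine or reporting an unsupported call.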

DOSbox even has several different CPU emulators -- a fast simplified one, a slower 32-bit one with protected mode, etc -- and switches between them depending on what CPU features are needed.

So there's a lot more to emulating a whole computer than emulating its CPU.

Last edited by Corona688; 06-20-2011 at 02:17 PM..

Corona688

06-21-2011

All the translation from the code you write into the target CPU's instructions would be done in software by the emulator. It basically provides a virtual environment, so search Google for "virtual machine".

shamrock

06-22-2011

Thanks for the Replies! My ignorance has been refined to....

I think I have figured out that part of the Emulator application is an Interpreter (if we choose this route) that the Emulator programmer writes, that uses the native target binary as source code (basically), and this Emulator's output creates a phony CPU that exists by means of the storage of its states. I'm interested to know if I have to translate machine code by hand to assembly or how that works?

The phony Emulator's output is not compatible with the host in this case of study. The part right now I can't grasp is where does the Emulation stop and when can the host use it? If I Emulate the hardware it is still in the form that the host can not use. If I don't actually get foreign output I have not emulated anything but maybe translated.

I'm thinking that emulation is mostly a structured way to translate the program. Emulation seems painfully indirect. Why not just translate it to begin with?

theKbStockpiler

06-23-2011

Quote:

Originally Posted by theKbStockpiler

I think I have figured out that part of the Emulator application is an Interpreter (if we choose this route) that the Emulator programmer writes, that uses the native target binary as source code (basically), and this Emulator's output creates a phony CPU that exists by means of the storage of its states. I'm interested to know if I have to translate machine code by hand to assembly or how that works?

It doesn't use it as source code so much as byte code. It doesn't have to compile or even translate it into local machine code -- all the programmer has to know is which bytes mean which instructions.

How about an imaginary processor with three registers and four instructions?

Code:

#include <stdio.h>

int main(void)
{
        int running=1;
        // The memory the emulated program is read from
        unsigned char program[]={0x01, 13, 0x02, 12, 0x03, 0x00 };
        unsigned char a=0, b=0, ip=0; // A reg, B reg, instruction pointer

        while(running)
        {
                // Show the CPU state before each instruction
                printf("A=0x%02x B=0x%02x IP=0x%02x\n", a, b, ip);
                // Fetch the opcode, step past it, and decode it
                switch(program[ip++])
                {
                case 0x01: // LODA: load the next byte into A
                        printf("LODA 0x%02x\n", program[ip]);
                        a=program[ip++];
                        break;
                case 0x02: // LODB: load the next byte into B
                        printf("LODB 0x%02x\n", program[ip]);
                        b=program[ip++];
                        break;
                case 0x03: // ADD: add A to B, result in B
                        printf("ADD B,A\n");
                        b += a;
                        break;
                case 0x00: // HALT: stop the emulator
                        printf("HALT\n");
                        running=0;
                        break;
                default:   // Any other byte is not an instruction
                        printf("ERROR invalid instruction 0x%02x\n", program[ip-1]);
                        return(1);
                }
        }

        // Show the final CPU state
        printf("A=0x%02x B=0x%02x IP=0x%02x\n", a, b, ip);
        return(0);
}

A real processor would be much more complicated, of course. x86 and x86_64, for instance, have instructions of different sizes, some as short as 1 byte and some longer than 12.
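Variable-length instructions mean the emulator cannot know where the next instruction starts until it has decoded the current one. Here is a minimal sketch of that idea, using the toy instruction set from the example above rather than real x86 encoding. The `insn_length` and `count_instructions` helpers are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Length in bytes of the instruction starting with this opcode, 0 if invalid. */
static size_t insn_length(unsigned char opcode)
{
    switch (opcode) {
    case 0x01:          /* LODA imm8: opcode plus one operand byte */
    case 0x02:          /* LODB imm8: opcode plus one operand byte */
        return 2;
    case 0x03:          /* ADD B,A: opcode only */
    case 0x00:          /* HALT: opcode only */
        return 1;
    default:
        return 0;
    }
}

/* Walk a program and count its instructions.
   Returns -1 on an invalid opcode or a truncated instruction. */
static int count_instructions(const unsigned char *mem, size_t size)
{
    assert(mem != NULL);
    size_t ip = 0;
    int count = 0;
    while (ip < size) {
        size_t len = insn_length(mem[ip]);
        if (len == 0 || len > size - ip)
            return -1;
        ip += len;      /* the next opcode starts only after the operands */
        count++;
    }
    return count;
}
```

Note that the two bytes `0x01, 0x03` count as one instruction, not two: the `0x03` is an operand there, not an ADD. A byte only has a meaning once you know where the instruction boundaries fall.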

Quote:

The part right now I can't grasp is where does the Emulation stop and when can the host use it?

I'm still not sure what you mean.

Quote:

If I Emulate the hardware it is still in the form that the host can not use.

Emulation doesn't turn a foreign program into a local program. Usually you emulate more of the system and interact with the system itself.

How to get the data out depends on what's being emulated and how. DOSbox, for example, supports files, so you could write the data you wanted to a file to make it available in the host OS. It can also pretend to have a modem or serial port, letting things outside the emulator connect to programs inside the emulator over TCP.

Quote:

Emulation seems painfully indirect. Why not just translate it to begin with?

How to do that isn't always obvious. How do you make 16-bit real-mode base/offset style memory access work in 64-bit protected mode with all the same side-effects? How do you translate instructions your processor has no direct equivalent for? Some architectures have weird ones, like "skip next instruction if bit N is set in register W". You could do it in two instructions, but then you'd have to worry about whether the flags register got altered by instruction 2 in ways that change how the program will branch later. How do you keep track of what your instruction pointer is supposed to be when the instructions aren't the same size they used to be? What about instructions like REP STOSB, which are entire little self-contained loops?
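Two of those problems are easy to emulate even though they are awkward to translate. The sketch below shows real-mode segment:offset addressing (segment times 16 plus offset, wrapping at 1 MiB as on an original 8086) and REP STOSB as a plain loop. The `struct cpu` layout and the function names are invented for illustration, and the register file holds only what this example needs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MEM_SIZE 0x100000u      /* 1 MiB: the 20-bit address space of the 8086 */

/* Real-mode address: segment * 16 + offset, wrapped to 20 bits. */
static uint32_t linear_address(uint16_t segment, uint16_t offset)
{
    return (((uint32_t)segment << 4) + offset) & (MEM_SIZE - 1);
}

/* Hypothetical register file, holding only what REP STOSB touches. */
struct cpu {
    uint8_t  al;
    uint16_t cx, di, es;
    int      df;                /* direction flag: 0 = count up, 1 = count down */
};

/* REP STOSB: store AL at ES:DI, step DI, decrement CX, until CX reaches 0. */
static void rep_stosb(struct cpu *c, uint8_t *mem)
{
    assert(c != NULL && mem != NULL);
    while (c->cx != 0) {
        mem[linear_address(c->es, c->di)] = c->al;
        /* DI is a 16-bit register, so it wraps within the segment */
        c->di = (uint16_t)(c->df ? c->di - 1 : c->di + 1);
        c->cx--;
    }
}
```

The emulator gets the wraparound side effects for free because it computes every address itself. A translator would have to reproduce them using native addressing that behaves differently.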

And once you translate it, then what? The program's still going to expect to be talking to its native operating system and not yours. You'd need to write your own substitutes. (Sort of what WINE does. WINE doesn't need to emulate anything, since it's running x86 programs on an x86 machine, but it does need to provide the Windows libraries that Windows programs expect.)

Not saying it's impossible, but it wouldn't be easy, would be difficult to debug, couldn't be ported anywhere, and could end up being as much overhead as just emulating it.

Last edited by Corona688; 06-23-2011 at 03:13 PM..

Corona688

06-26-2011

I think I have part of my confusion worked out.

I assumed that the emulated CPU would execute the code and not just keep track of its own state. Simulating a CPU is really only keeping track of its state. Therefore this emulated CPU's executed code would not be of any use other than to the emulated CPU.

Let's say one person is writing in a foreign language. They could write an essay while referencing other documents in the same language and it would be compatible with itself. If I needed information parsed in this foreign language, the answer could be obtained with just Emulation, but for a speaker of a different language to understand it, it would have to be simulated or actually converted.

Using the switch statement is really simulation because the system call or whatever is being substituted with one that is compatible with the host. The emulated CPU is not parsing information but a selection is being chosen from a list which is not emulation but simulation.

I'm still trying to figure out why the state of the CPU is important when the code is simulated anyway, but I wanted to keep this thread alive.

Thanks for the great replies!

theKbStockpiler

06-26-2011

Most CPU opcodes just move memory around. It's the devices attached to the I/O ports that make it useful. Check out this JavaScript PC emulator: Javascript PC Emulator

The only external device he's emulated is the serial port. I suppose it catches the bytes being written to I/O port 0x3f8 and sends them to his host's terminal.
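As a guess at what that looks like inside an emulator (this is not the JavaScript emulator's actual code), the CPU core can hand every port write to a hook like the one below. The `port_out` name and its signature are invented for illustration. The one real detail is that 0x3f8 is the data register of the first PC serial port.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define COM1_DATA 0x3f8         /* data register of the first PC serial port */

/* Hypothetical hook the CPU core calls for every OUT instruction.
   Returns 1 if an emulated device claimed the port, 0 otherwise. */
static int port_out(unsigned short port, unsigned char value, FILE *terminal)
{
    assert(terminal != NULL);
    if (port == COM1_DATA) {
        fputc(value, terminal); /* the "serial line" is just the host terminal */
        fflush(terminal);
        return 1;
    }
    return 0;                   /* nothing is attached to this port */
}
```

Passing `stdout` as `terminal` makes anything the emulated program sends to its serial port appear on the host's screen. A matching hook for port reads would feed host keystrokes back in.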

It's tough to correctly emulate a foreign machine. Sometimes the code modifies itself. You often can't step through it, replace it with the host machine equivalent, and save the output. There would be differences in memory locations, for one.

If you'd like to start with something easy, the 6502 CPU would be a great start. It's super-old, but I found a 6502 core in an embedded IC recently and was having fun learning it with 6502asm.com, a 6502-compatible assembler and emulator in JavaScript.

neutronscott
