February 23rd, 2007, 01:11 PM
2 or more cores working as 1???
With all the hype about increasing the number of cores and improving architecture rather than raising core speeds, wouldn't it make sense, given the number of power users, for an operating system or program to be able to run a single-threaded program as if it were multithreaded? I don't mean parallel processing. The ability to have a single thread use both cores fully would be a true breakthrough in both hardware and software. Right now, an operating system can only balance the load across the cores. But a program that could use the FULL power of a multi-core processor would be very beneficial to power users.
Having a 4 core system with every core running at 1.8GHz *cough*Core2Quad*cough* is a bit of a step backwards if you ask me. Yeah, some programs can use the 4 streams allowed, but are there many users who would constantly use the 4 cores to their potential? The same goes for the dual core processors. There aren't many things that can even REALLY utilize both cores.
Theoretically this would be very easy to emulate. Though you would lose a fraction of the processor power and a small bit of memory, the gain would be very beneficial if you plan on focusing on only one program at a time.
I personally have a Core 2 Duo E6300 (couldn't afford the E6600 at the time). The stock clock speed is 1.8GHz, but with overclocking I can reach a little over 3GHz on air cooling (yes, stable). But even with the multiple cores, I'm still finding that many of the programs that are available don't use the processor to its full potential.
You have to admit... having the power of 6GHz on one application is a dream worth waiting for. Especially with the arrival of quad cores and soon (not saying within a year or 2) octo-core processors. You can build a quad-processor Pentium D board and get very nice multitasking functionality. But why not be able to create multiple streams from one program?
*BIOS option for single core emulation*
*OS option for single core emulation*
Take your pick. I'm sure you could also make an OS that loads another OS, with the first OS loaded acting as the single core emulator and the second as the main OS.
I WANNA USE BOTH MY CORES EFFECTIVELY DAMMIT!!!!!
February 23rd, 2007, 01:54 PM
You need to look at the reasons for going to multiple cores. A dual-core 3GHz CPU will only use a fraction of the total power that a 6GHz CPU might use, and it is far, far cheaper and easier to build, while at the same time providing nearly identical processing power to software that can use both cores. That's why you can buy a quad 1.8GHz CPU but not a 4GHz CPU.
As for your applications not using it all, well, that's down to the application itself. When you have multiple cores, the software needs to be coded to actually use them: the developer must decide what can be done out of order and put that work in a separate thread so it can be executed in parallel on a separate core, and that requires work by the developer. A lot of the newer games have been changed to use multiple cores, and many pro apps support them too. A lot of the smaller apps that are not too CPU-aggressive don't really use all cores, but most of the big consumers of CPU have been updated. Then look at the server environment: multiple cores have been the norm there for years, and everything uses all cores to their full potential.
And as for "single core emulation", well, AMD has said they are working on it, but it's not as easy as you think. A dual-core CPU might be able to do two things at once, but a program that is only written to support one core is written to do one thing and then use the result immediately, such as adding two numbers and then multiplying the result by something: (1+2)*3. You can do the 1+2 on one core, but the other core can't do the next step (3*3) until it knows the result of 1+2. If you wait for core 1 to do 1+2, then it is faster to do 3*3 on core 1 as well, because it is idle after that operation, and transferring the result to core 2 would be very, very slow and do nothing but waste time. A program written for two cores, on the other hand, will typically cut the work into large chunks where order doesn't matter, tell the OS to run all of them, and then combine the results when they return. Maybe one part will just calculate the physics for the environment, one will manage textures, and one will do the models, then one part will combine the information and send it to the video card. But this requires the programmer to define what those parts are.
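To make the contrast in the post above concrete, here is a minimal Python sketch. The function names (`dependent_chain`, `sum_of_squares`, `chunked_total`) and the sum-of-squares workload are made up for illustration and are not from the thread. It uses a thread pool only to show the "cut into chunks, run them all, combine" structure; in CPython a process pool would likely be needed to get real CPU parallelism out of it.

```python
from concurrent.futures import ThreadPoolExecutor


def dependent_chain():
    # (1 + 2) * 3: the multiply needs the result of the add,
    # so the two steps can only run one after the other.
    a = 1 + 2
    return a * 3


def sum_of_squares(chunk):
    # Independent work: a chunk can be processed without knowing
    # anything about the other chunks (hypothetical workload).
    return sum(n * n for n in chunk)


def chunked_total(numbers, workers=2):
    # Cut the input into slices where order doesn't matter,
    # hand them all to the pool, then combine the partial results.
    size = max(1, (len(numbers) + workers - 1) // workers)
    chunks = [numbers[i:i + size] for i in range(0, len(numbers), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(sum_of_squares, chunks))
    return sum(partials)


print(dependent_chain())
print(chunked_total(list(range(1000))))
```

The first function has nothing a second core could help with, while the second only works because the programmer decided up front where the independent pieces are.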
February 23rd, 2007, 03:05 PM
Great thread! We are actually looking at getting a new Dell and I have been reading more about these. I never really got too much into building computers, but the more I understand about dual core processing, the more disheartened I am getting by it, unfortunately.
And most of them default to 1.86GHz, but some programs require a minimum of 2GHz to run. At first I thought that would be no problem, but two 1.86GHz cores do not add up to a 3.72GHz CPU. So now I am back at the drawing board trying to figure it out.
But I understand more now, thanks to the posts above. I just want a fast machine.
February 23rd, 2007, 03:21 PM
Well, when you see an application say it requires a 2GHz CPU, generally they mean a 2GHz P4, but the P4 is one of the slowest CPUs out there in performance per hertz. A ~1.6-1.8GHz CPU from AMD will have approximately equal power per core. When you see a new computer today with a speed of 1.8-2.4GHz, look at the CPU: it's probably a Core 2 Duo, which can easily be 20-40+% faster per hertz than a P4. My Core Duo is 2.16GHz and I can run a single-threaded app at approximately the same speed as on my 3.3GHz P4; with a multi-threaded application it's twice as fast. The Core 2 Duo is about 20% faster than that per hertz, so that 1.8GHz computer you are looking at should be considered something like two 2.4-2.6GHz P4 CPUs for the purpose of comparing against the label on the box of some piece of software. So in your case, that 1.8GHz CPU is very much fast enough for your application that needs a 2GHz CPU.
February 23rd, 2007, 05:49 PM
I guess I should have been more specific ... a small text graphic should be fine ...
L2 cache----- Main OS when booted ------ "Windows"
Example: the program makes thread A. "Windows" sends the data to the LOGICAL CPU (actually the first OS loaded), and the logical CPU then splits the thread across the cores. But unlike multithreading, the "program" actually uses the processing power of both cores.
I know the reason for changing focus from higher core speeds to multiple cores. With the architecture of that time, heat became an issue when trying to surpass the "4GHz barrier", so designers decided to decrease clock speeds and increase the number of cores. Demand for multitasking grew as well, so it was a viable solution. BUT ... eventually designers will have to take another step forward, one that won't be simply increasing speeds or the number of cores. Instead, the task will be the COMPLETE allocation of multiple cores to a single program's processing.
Though ironically there are some CPUs that, with some decent cooling, can surpass 4GHz and be stable (a Pentium D that runs at 2.66GHz stock, I believe). But they wouldn't be practical given the amount of cooling needed. The main focus in determining power has shifted from clock speed to architecture. This is where I can see the superiority of either Intel or AMD becoming VERY clear. Besides architecture, one thing I've noticed that differs between AMD and Intel is the amount of L1 cache and, more obviously, the CPU bus speeds ... *drools over his father's 2GHz bus speed*.
Side note, on using a 1.8GHz CPU against a requirement of a 2.0GHz P4: clock speeds aren't always the most accurate way to judge a computer's power. For example, a Pentium D (dual core) 3.0GHz machine at my school gives me about 600KB/sec in WinRAR's benchmark (simple and accurate), while my computer at stock speed (1.8GHz Core 2 Duo) gets about 800KB/sec. When I overclock my machine to 3.0GHz to match the clock of the Pentium D, I get roughly 1300KB/sec (1200KB/sec at my current speed of 2.8GHz). Many factors determine the power of a machine, and for the CPU alone the architecture, processor features, cache, and clock speed all determine its "power". A 2.0GHz Pentium 4 (it might be a 2.8 or 3GHz one in the library, too tired to remember) gave me roughly 200KB/sec. So I would recommend using a program to test the computers you use (like WinRAR, because it's simple and very easy) if you would like to compare the "power" of different processors.
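As a rough illustration of the point about clock speed, the WinRAR figures quoted in this post can be divided by clock speed to get throughput per GHz. This is only a sketch built on the numbers as posted; the Pentium 4 result is left out because the post is unsure of that machine's clock, and `per_ghz` is just a helper name chosen here.

```python
# (label, clock in GHz, WinRAR benchmark result in KB/sec) as quoted in the post
results = [
    ("Pentium D", 3.0, 600),
    ("Core 2 Duo, stock", 1.8, 800),
    ("Core 2 Duo, overclocked", 2.8, 1200),
    ("Core 2 Duo, overclocked", 3.0, 1300),
]


def per_ghz(clock_ghz, kb_per_sec):
    # Benchmark throughput normalized by clock speed.
    return kb_per_sec / clock_ghz


for label, clock, score in results:
    print(f"{label} @ {clock}GHz: {per_ghz(clock, score):.0f} KB/sec per GHz")
```

On these figures the Core 2 Duo gets a little over twice as much work done per GHz as the Pentium D, whatever clock it is running at.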
Last edited by matkisson; February 23rd, 2007 at 05:55 PM.
Reason: the text graphic was not correctly formatted
February 23rd, 2007, 05:51 PM
Thanks for the additional information. I don't mind saying that I did not even know the Core Duo would be faster on that level... I have been debating whether to get a higher clock speed or RAID (and with my recent problems with the HD, I was leaning more toward RAID), but it looks like I will be able to have speed and RAID.
Thanks again for that! And if you have a blog that you write or anything like that, feel free to let me know so I can maybe read a bit more. It is all very much appreciated.
February 23rd, 2007, 05:58 PM
If you're upgrading a regular computer, I wouldn't worry too much about RAID ... just try to get a SATA2 drive for its speed. As for the CPU, it's more of a personal-need thing. If you're going for gaming, try and get an AMD X2 ... they own (Pop's computer). But if it's for office work or just random CPU-intensive things, get a Core 2 Duo; they can overclock A LOT in case you need the power.
Also ... don't get a Dell ... they suck. If you can't build one, I would suggest getting a friend to ... much cheaper and better.
February 23rd, 2007, 06:17 PM
To the extent that it can be done AND maintain correctness of the executing code (no trivial task), CPUs have been doing this since the original 60MHz Pentium. That's what a superscalar CPU does: it has multiple execution units applied to the instruction stream.
Originally Posted by matkisson
IMO, in the real world, with modern bloatware, you're better off spending the die space on larger L1's.
February 23rd, 2007, 06:27 PM
This thread was mostly meant to comment on the possibility of allocating 2 cores to work as 1, which means not requiring the program to use HYPER THREADING to fully utilize a dual or quad core system. THIS hasn't been done and isn't close. More L1 cache would increase performance (look back at my earlier post on AMD's L1 compared to Intel's), but again, that's not a factor when trying to completely allocate cores or emulate a single-core system on a multi-core computer.
February 23rd, 2007, 06:44 PM
Do you realize how expensive maintaining cache coherency is?
Originally Posted by matkisson
Hyperthreading has nothing to do with multiple cores, it has to do with utilizing the waits involved in pipeline stalls.
Last edited by Purple Avenger; February 23rd, 2007 at 06:51 PM.
February 23rd, 2007, 07:14 PM
Well, you clearly have never programmed in assembly; if you had, you would know why this wouldn't work. A computer program that ends up going to the processor might look something like this, for x=(3+4)*5:
copy 3 from RAM to register A
copy 4 from RAM to register B
add A and B (save the answer in A)
copy 5 from RAM to B
multiply A and B (save the answer in A)
copy A to RAM where x is stored
Now just try to make that run any faster by using an extra core. Remember that RAM here is probably going to mean the cache, which may be 5 times slower than the CPU, and that copying data from a register on CPU A to a register on CPU B is probably going to take 10-20 times as long as just reading or writing a register on the current CPU. Looking at the code above, just tell me how you would make that run on 2 cores.
Lines 1 and 2 can be done at the same time, but they must be done on the same core, as moving data between cores takes more time than executing the entire program. Line 3 must be done after both 1 and 2 are finished. Line 4 could possibly be done at the same time as 3, but again, it must be on the same core as all the other lines. The multiply line depends on the previous two lines, so again, same core. Same with the last line.
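The dependency argument above can be sketched as a small graph in Python. This is only an illustration under simplifying assumptions that are not in the post: every step costs one unit of time, moving data between cores is free, and the reuse of register B is ignored (as if the hardware could rename it). The step names and the helper `critical_path_length` are made up for the sketch.

```python
# Each step maps to the earlier steps whose results it needs.
steps = {
    "copy first operand to A": [],
    "copy second operand to B": [],
    "add A and B": ["copy first operand to A", "copy second operand to B"],
    "copy 5 to B": [],
    "multiply A and B": ["add A and B", "copy 5 to B"],
    "copy A to x": ["multiply A and B"],
}


def critical_path_length(deps):
    # Length of the longest chain of steps that must run one after
    # another, no matter how many cores are available.
    depth = {}

    def visit(step):
        if step not in depth:
            depth[step] = 1 + max((visit(d) for d in deps[step]), default=0)
        return depth[step]

    return max(visit(step) for step in deps)


print(len(steps), "steps in total")
print(critical_path_length(steps), "steps on the longest dependent chain")
```

Even in this idealized model, most of the steps sit on one chain (copy, add, multiply, store), so extra cores have almost nothing to work on, and that is before adding the cross-core transfer cost described above.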
In that example the only optimization you can make is running the copy operations in parallel, and that's something all modern processors do to some extent. They analyze the code long before it gets executed, work out which pieces of RAM it will probably need soon, and do what is needed to get those pieces moved into the fastest cache near the CPU that is going to need them. Another optimization used in modern processors is pipelining: they break operations into many, many steps. Because the multiply operation, for example, may take 20 cycles but only uses the registers on the first and last, the processor will start the multiplication and then on the next cycle start executing the next line. So in this case, while it's adding A and B it will also copy 5 to register B, so when the add is done the multiplication can start immediately. (IIRC the C2D will do 4 instructions in parallel, while most others only do 3; it's a major reason for its speed.)
And Intel's Hyper-Threading goes even further. It presents two logical CPUs, so one thread can do, say, an add while another thread does a floating-point multiply. Depending on the operations in both threads it can give a huge performance boost, but in reality it's closer to the speed of a single core, because if both threads want to multiply, one of the logical cores will effectively freeze waiting for the other thread to finish its operation.
February 23rd, 2007, 08:04 PM
My gut suspicion is that this sort of scheme would drive a quad Xeon down to 386/486 speeds. I just don't see how the existing code base could possibly benefit.