ParallelFox and HyperThreading

Years ago, Intel added a HyperThreading feature to their CPUs before dual-core processors were available. More recently, Intel reintroduced the technology into their “Core i” series of processors. What is HyperThreading and how does it affect ParallelFox? Let’s start with the Wikipedia description:

Hyper-threading works by duplicating certain sections of the processor—those that store the architectural state—but not duplicating the main execution resources. This allows a hyper-threading processor to appear as two “logical” processors to the host operating system, allowing the operating system to schedule two threads or processes simultaneously. When execution resources would not be used by the current task in a processor without hyper-threading, and especially when the processor is stalled, a hyper-threading equipped processor can use those execution resources to execute another scheduled task. (The processor may stall due to a cache miss, branch misprediction, or data dependency.)

What this boils down to is that a single thread/process will not utilize all of the execution “slots” or “units” in a CPU core. This is especially true when the processor is “stalled”, meaning that the processor is waiting on something before it can continue. This may be due to the inherent design of the CPU, or because it is waiting for data to be accessed from main memory. HyperThreading allows a second thread/process to utilize the unused execution slots. Generally, this is a good thing and can provide a 15-30% performance boost to parallel processing.

However, in cases where there is heavy competition among threads for the same execution slots and other resources, HyperThreading can be slower than running a single thread on each core. The examples that ship with ParallelFox exploit this weakness. On a single-core HyperThreading CPU, the “after” examples are actually slower than the “before” examples. Of course, this was not intentional. The reason is that the examples simulate work rather than resemble real-world code. Here is the SimulateWork() function:

Procedure SimulateWork
   Local i
   For i = 1 to 1000000
      * Peg CPU
   EndFor
EndProc

While this code does a good job of pegging a CPU core at 100%, it also causes the same few instructions to be executed millions of times. With HyperThreading enabled, competition between the two threads for the same CPU resources is extreme. In a real-world scenario, there would likely not be this much competition for resources and HyperThreading would be beneficial.

As with most things, your mileage may vary. If you find that your code runs slower with HyperThreading, you can tell ParallelFox to use only half of the “logical” processors and start only one worker per physical core. Here is example code for that:

* Use only physical cores
If Parallel.DetectHyperThreading()
   Parallel.SetWorkerCount(Parallel.CPUCount / 2)
EndIf
Parallel.StartWorkers("MyApp.EXE")
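
Whether halving the worker count actually helps is easiest to judge by timing the same workload both ways. The following is only a minimal sketch under stated assumptions, not code from the ParallelFox examples: TimeRun() is a hypothetical helper, it assumes ParallelFox.vcx is in the path, that SimulateWork() is available inside MyApp.EXE, and that Parallel.Call() and Parallel.StopWorkers() behave as described in the ParallelFox Help file. The measured time may also include some worker startup cost.

```foxpro
* Hypothetical helper: run 100 units of simulated work on tnWorkers workers
* and return the elapsed time in seconds.
* Assumes ParallelFox.vcx is in the path and SimulateWork() is in MyApp.EXE.
Function TimeRun(tnWorkers)
   Local Parallel as Parallel, lnStart, lnElapsed, i
   Parallel = NewObject("Parallel", "ParallelFox.vcx")
   Parallel.SetWorkerCount(tnWorkers)
   Parallel.StartWorkers("MyApp.EXE")
   lnStart = Seconds()
   For i = 1 to 100
      Parallel.Call("SimulateWork")   && queue one unit of work (assumed API)
   EndFor
   Parallel.Wait()                    && block until the queued work is finished
   lnElapsed = Seconds() - lnStart
   Parallel.StopWorkers()             && assumed API: shut the workers down
   Return lnElapsed
EndFunc
```

It could then be called once per configuration, for example:

```foxpro
Local loParallel
loParallel = NewObject("Parallel", "ParallelFox.vcx")
? "All logical processors:", TimeRun(loParallel.CPUCount)
? "Physical cores only:", TimeRun(Int(loParallel.CPUCount / 2))
```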

ParallelFox uses WMI (Windows Management Instrumentation) to detect if HyperThreading is enabled. WMI has shipped with Windows since Windows 2000. However, WMI can only detect HyperThreading on Windows XP SP3, Windows Server 2003, and later versions, because that is when Microsoft introduced the required APIs. On previous versions of Windows, Parallel.DetectHyperThreading() will always return .f. even if HyperThreading is enabled.
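
For readers curious what such a check can look like, here is a rough sketch. It is not ParallelFox's actual implementation, which is not shown in this post. IsHyperThreadingOn() is a hypothetical name, and the sketch assumes the WMI Win32_Processor class exposes the NumberOfCores and NumberOfLogicalProcessors properties on the machine in question. If the query fails, it falls back to .f.

```foxpro
* Hypothetical helper, not part of ParallelFox.
* Returns .t. when WMI reports more logical processors than physical cores.
Function IsHyperThreadingOn
   Local loWMI, loCPUs, loCPU, lnCores, lnLogical, llResult
   Store 0 to lnCores, lnLogical
   llResult = .f.
   Try
      loWMI = GetObject("winmgmts:\\.\root\cimv2")
      loCPUs = loWMI.ExecQuery("Select * from Win32_Processor")
      * Sum across all processor packages (sockets)
      For Each loCPU in loCPUs
         lnCores = lnCores + loCPU.NumberOfCores
         lnLogical = lnLogical + loCPU.NumberOfLogicalProcessors
      EndFor
      llResult = lnLogical > lnCores
   Catch
      * Properties unavailable or WMI error: report "not detected"
      llResult = .f.
   EndTry
   Return llResult
EndFunc
```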

There are several other features I want to add to ParallelFox, but at this point, I think it is feature complete for version 1.0. Also, very few issues have been reported from previous versions, so I am moving this release up to release candidate status.

Download ParallelFox at VFPX.


9 Replies to “ParallelFox and HyperThreading”

  1. Hi Joel:

    We have a legacy application that should continue to run for some more years until it is replaced. Meanwhile, we are trying to speed up some slow processes that can be run in parallel, so obviously I thought of your ParallelFox library 🙂
    What I want to ask is this: given what you said in this article about HyperThreading not always being beneficial, could it help to add some strategic DOEVENTS calls in the code so that no process uses the CPU at 100%?

    Many thanks and great work!

    1. Hi Fernando,

      My tests for this post were with an old single-core chip that had the original HyperThreading. Sometime after writing this, I ran tests on an i7 processor, and I didn’t have the same blocking problems. It turns out that Intel added some hardware to the CPUs to reduce blocking when they reintroduced HyperThreading. I thought I had updated this post or written that somewhere, but I guess not (sorry about that).

      My suggestion is to treat those HyperThreaded logical cores as real cores and see how things go. If you’re processing data, there are lots of bottlenecks (hard drive, memory, etc.) that you’ll experience before pegging the CPU. If you do find that one process is hogging the CPU, you can use Worker.Sleep(0) to force a task switch. See the “Scalability and Performance” section in the Help file for more on this subject.

  2. Hi Joel!
    It has been a long time since my last question. I hope you don’t mind another one 🙂

    Last time I did some testing but couldn’t implement your library because of time constraints, but now I have another chance to do it.
    My question this time is about user interface responsiveness: I’ve implemented a simple form with a button and an editbox. The button runs a blocking process with the parallel library, and meanwhile the UI is unresponsive, so I can’t cancel or do anything in the UI and must wait for the process to finish.
    Is this normal, or is there a way to make the UI responsive while a blocking process is running? I’ve tried using Sleep() with some milliseconds, but that does not work either. I searched for an example similar to what I need, but all the PRGs are use cases in which the processes are executed without any participation of the user in the UI.

    Many thanks!

    1. Hi Fernando,

      Are you issuing a Parallel.Wait() in the main UI process? That intentionally blocks the main process, so that it will wait until the workers are finished running. If you want the UI to be responsive, you will need to remove Parallel.Wait(). Then use Parallel.BindEvent() if you need to detect the progress of the workers.

      If that’s not it, is your CPU pegging at 100%? Without looking at your code, I’m not sure what else could be causing this.

    1. Fernando,

      The VFPX forums are the best place, although I haven’t done a very good job of monitoring those lately. I believe VFPX also has a way to send me a direct message. And of course, I get notified of comments here. I get enough spam as it is, so I don’t like to post my email address publicly.

      Thanks,

      Joel

  3. Hi, Joel!
    Firstly, I want to thank you for your ingenious project, ParallelFox. I badly needed multithreading in my VFP project and didn’t know what to do until I came across ParallelFox. So thank you once more!
    But I ran into trouble deploying it. I did my best but couldn’t solve it. All your examples work fine in debug mode, but when I changed it to release mode (Parallel.StartWorkers(FULLPATH("Call_After.prg"),,.F.)) it stops working! The threads are evidently started but never finish. Another user reported the same problem; you can find his report in the issues section on your GitHub page…
    Would you be so good as to give me a hint about where to look for the reason?
