If you see a global pointer (the old-fashioned, 'dumb' style), there are three possibilities:
- It's a leak. Somewhere it's allocated, but it is not always released.
- It's dangling. Since it is always in scope, but not always allocated, sometimes it dangles. Even if 98% of the time it is set to NULL immediately after being released, the remaining 2% will hurt you.
- Both.
Whenever you open a source file and see a global pointer, exterminate it. Have no mercy. This will save you work and headaches down the road.
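One way to do the exterminating, sketched below under assumptions (the `Widget` type and `UseWidget()` are hypothetical stand-ins): replace the raw global with a scoped owner such as `std::unique_ptr`, so the object is released exactly once and the pointer is never left dangling.

```cpp
#include <memory>

// Hypothetical payload type, for illustration only.
struct Widget { int value = 42; };

// The 'dumb' style would be:  Widget *g_widget = NULL;  // may leak, may dangle
// A scoped owner instead: it releases the object exactly once.
static std::unique_ptr<Widget> g_widget;

int UseWidget() {
    g_widget = std::make_unique<Widget>();   // allocated here, owned here
    int v = g_widget->value;
    g_widget.reset();   // freed exactly once; the pointer is now empty,
    return v;           // not dangling, so testing it is always safe
}
```

Testing `g_widget` after `reset()` is well-defined (it compares equal to `nullptr`), which is exactly the guarantee a 'dumb' global pointer cannot give you.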
Pablo.
"Accident: An inevitable occurrence due to the action of immutable natural laws." (Ambrose Bierce, circa 1899).
modified 17-May-12 4:57am.
How do you handle specifications?
There are two approaches I've seen in friends and acquaintances:
- The coder uses specs as a baking recipe: the code should be a translation of the specs into a language the compiler understands.
- The programmer is an artist: specs are just another source of inspiration.
These different styles of coding match two different kinds of specification: the artist will be able to use user stories, but will require an understanding of why a given functionality is required, and of the benefit it is supposed to bring to the people using the software.
This style of specs is (in my opinion) easier to write, and to understand by the layman.
Specs written for the coder, on the contrary, must be painfully detailed. Most QA testers I've met seem to love these, but sooner or later a real customer complains that the program does not do something obvious - and faces the answer 'it works exactly as specified'.
What is your experience?
What kind of specs do you find most useful? How strictly does your code match those specs?
I'd really like to hear.
Pablo.
However detailed the specifications one writes, there is always a set of things that goes without saying, i.e. that is assumed to be included even if not written. So I always go by the second approach, and I like to talk to both the client and the developer: to make sure I understand what the client wants (including the unsaid assumptions), and that the coder understands what needs to be done. I prefer to leave a little bit of room for the coder to think about what a client would require, and to correct him/her if there is a gap. It's a slow learning process, but it helps in the long term: the coder starts to pick up those unsaid details, making the specs less tedious and more of a guideline.
It's either quick and clean, or slow, dirty and painful.
Pablo.
Hi,
Just a question: I run a website of VC++ FAQs, VisualCpp.org. Would you allow me to post your blog posts there, with a reference back to you?
Thanks, and waiting for your reply!
"Opinions are neither right nor wrong. I cannot change your opinion. I can, however, change what influences your opinion." - David Crow
"Never mind - my own stupidity is the source of every 'problem'" - Mixture
cheers,
Alok Gupta
VC Forum Q&A :- I/ IV
Support CRY- Child Relief and You
Yes, by all means.
Pablo.
Did you see the thread procedure below?
The pointer pX, which is cast to its real type, is shared among threads.
Using it outside the thread loop (after setting the 'I'm really done' event) is a big no-no, since by then the object it points to may have been deleted by its owner.
You can protect yourself by defining a scope, like this:
DWORD WINAPI ThreadProc(LPVOID pX)
{
    if (pX != NULL)
    {
        // realPointer lives only inside this scope:
        X *realPointer = reinterpret_cast<X *>(pX);
        MSG msg;
        bool done = false;
        ::PeekMessage(&msg, NULL, 0, 0, PM_NOREMOVE); // force creation of the queue
        while (!done)
        {
            ::GetMessage(&msg, NULL, 0, 0);
            switch (msg.message)
            {
            case WM_COMMAND:
                realPointer->ExecuteCommand(msg.lParam);
                break;
            case WM_QUIT:
                done = true;
                break;
            }
        }
        ::SetEvent(realPointer->reallyDone);
        // Nothing after SetEvent() may touch realPointer:
        // once the event is signalled, the object may be deleted.
    }
    return 0;
}
I'm sad and ashamed to admit - this is from experience.
Happy programming,
Pablo.
The word 'legacy' means different things to different people. To me, it brings associations of haunted manors...
A few weeks ago, I found myself writing a program (Windows service, listens for events and passes them on) which had to use a legacy DLL while providing decent performance.
The first thing that came to mind was multithreading. It didn't last.
That DLL does not support multithreading: it has lots of global variables which, if you're old enough, you may remember were deemed 'efficient' by C programmers, since they don't take stack space; in the days of DOS, with a 32 KB stack, that was meaningful.
Refactoring the DLL was not an option: lots of work and lots of testing, with the prospect of breaking something, and under an extremely tight budget; and someone else is in charge of maintenance of that DLL.
Then I remembered having seen threads referred to somewhere as 'lightweight processes' (in a book about Windows 95 programming), so I figured: can processes do the task?
Fortunately, the DLL exposes only one function, which performs a rather lengthy batch operation.
The solution was like this:
I wrote an EXE which takes parameters somehow (it reads them from a SQL stored procedure, but the command line or a text file could have been used as well), and passes them to the DLL.
The service manages a process pool, whose maximum size depends on the number of CPUs in the machine.
When an event is detected, it's added to the SQL table which the aforementioned stored procedure reads from, and one of the worker processes is selected from the pool; if necessary (i.e. unless it's already running), CreateProcess() is called on the executable.
It might seem a bit crude, but it gives above-decent performance, the service itself keeps a tiny footprint (it has no code that really 'does' stuff), and any crashes in the legacy DLL (it hasn't happened yet, but it might) can't bring down the service.
Also, debugging a 'regular process' (single threaded) is easier than debugging a multi-threaded service.
The moral? Sometimes heavyweight is the right weight. As much as I like threads, there are other tools available, and they should not be forgotten.
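The pool bookkeeping described above can be sketched roughly like this. This is a minimal illustration, not the actual service: `worker.exe` and the `ProcessPool` type are hypothetical, and on non-Windows builds the launch is only recorded, since the point is the structure (size the pool by CPU count, reuse a running worker, launch only when needed).

```cpp
#include <cstddef>
#include <string>
#include <thread>
#include <vector>

#ifdef _WIN32
#include <windows.h>
#endif

// At most one worker process per CPU; hardware_concurrency() may return 0.
std::size_t PoolSize() {
    unsigned n = std::thread::hardware_concurrency();
    return n ? n : 1;
}

struct ProcessPool {
    std::size_t maxSize;
    std::vector<std::string> running;   // command lines of live workers

    explicit ProcessPool(std::size_t n) : maxSize(n) {}

    // Returns true if the job's worker is running (reused or just launched).
    bool Dispatch(const std::string &cmdLine) {
        for (const auto &r : running)    // already running? reuse it
            if (r == cmdLine) return true;
        if (running.size() >= maxSize)   // pool is full: caller must wait
            return false;
#ifdef _WIN32
        STARTUPINFOA si = { sizeof(si) };
        PROCESS_INFORMATION pi = {};
        std::string mutableCmd = cmdLine;  // CreateProcess may modify it
        if (!CreateProcessA(NULL, &mutableCmd[0], NULL, NULL, FALSE,
                            0, NULL, NULL, &si, &pi))
            return false;
        CloseHandle(pi.hThread);
        CloseHandle(pi.hProcess);          // a real pool would keep this
#endif
        running.push_back(cmdLine);
        return true;
    }
};
```

A real service would keep the process handles to detect worker exits and crashes; that detail is omitted here for brevity.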
Pablo_A
You can find a lot of thread classes and patterns on the Web, but you won't find one which fits all your needs.
That, at least, is what happened to me.
Learnt a lot along the way, though. Here goes some of it.
The main issue is synchronization. It comes in many shapes and flavors, but the general idea is: the less you serialize access to shared data, the better off you are, both with respect to performance and to the risk of deadlocks.
Then again, there are several kinds of multithreading models.
The simplest one, is the 'shoot and forget' kind: print spooling is an example. You start a process running, and it will be done when it's done. Synchronization here is easy: just don't.
Then there are the foreman/workers models, where one object (the foreman) lives in a thread and sends jobs to one or more 'worker' objects, which execute those jobs on separate threads. Here you can use any kind of mutex (in the broad sense of the term, which includes critical sections and the PostThreadMessage() function); the appropriate one depends on how long each job takes, and on how long the program will be 'on the air'. A Windows service must be more robust than a batch process that runs for just a few minutes.
You avoid deadlocks by locking only the message queue between the foreman and each worker (each worker has one), and for the shortest possible time. In particular, the worker thread should lock the queue only while it's popping a job, and NOT while it's executing the job. In this design, no resources are shared among threads except the message queue between a worker and its foreman. The Windows API PostThreadMessage() works like that.
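That lock discipline can be sketched in a few lines of standard C++. This is a minimal illustration under assumptions (the `Job` type and `DrainAndSum()` are hypothetical), not the full foreman/worker machinery: the lock is held only while pushing or popping, never while a job runs.

```cpp
#include <deque>
#include <functional>
#include <mutex>

// A hypothetical stand-in for whatever the foreman produces.
using Job = std::function<int()>;

class JobQueue {
    std::mutex m_;
    std::deque<Job> q_;
public:
    void Push(Job j) {
        std::lock_guard<std::mutex> lock(m_);  // critical section begins...
        q_.push_back(std::move(j));            // ...and ends immediately
    }
    bool TryPop(Job &out) {
        std::lock_guard<std::mutex> lock(m_);
        if (q_.empty()) return false;
        out = std::move(q_.front());
        q_.pop_front();
        return true;
    }
};

// Worker loop: pop under the lock, execute OUTSIDE it.
int DrainAndSum(JobQueue &q) {
    int total = 0;
    Job job;
    while (q.TryPop(job))   // the lock is held only inside TryPop()
        total += job();     // no lock held here: the foreman can keep pushing
    return total;
}
```

Since each worker owns its own JobQueue, two workers can never contend for the same lock, and a slow job never blocks the foreman.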
An alternative to this model holds only one message queue for all workers. This one is practical if jobs are rather lengthy: the key question is, how often will a worker wait on the shared message queue while another worker is popping a job.
Some other resources might be shared among threads: think about an error log, if you use a text file for error logging. In that case, you need an OS-level lock (such as a named mutex, in Windows) while you're working with that file. Since errors should be relatively rare, you can get away with this; but if you're logging, let's say, results, a more robust approach will be needed. A database, for example.
Designing a multithreaded job
It all begins with a piece of paper, divided in several columns:
- User: Represents the user thread.
- Foreman: Gets a 'start working' message from the user thread, generates 'jobs' and divides them among the workers (or posts them to a common queue, and each available worker will take them from there). Might report progress to the user's thread.
- Workers: This column is usually divided in three: A, B, and Murphy. A and B take turns trying to work, Murphy will try to lock an object whenever it's needed by another thread. As you might have guessed, at design time Murphy is my closest collaborator.
The workers will usually not report progress, but set some state which can be queried by the foreman.
I've been using threads whenever suitable for a few years now, and haven't met a deadlock yet (I fell into a database deadlock once, but that's a totally different story). Performance is usually satisfactory, despite the cost of thread switches; those can be helped by using a LIFO approach for worker threads, if you've got several of them. Debugging is not that much of an issue, since I unit-test as much as possible (not everything, to be honest), and I keep the business logic apart from the threading code; I like traces better than breakpoints, anyway.
The bottom line: I've heard people say that multithreading in C++ is hard and error-prone. Well, those reports, in my humble opinion, are greatly exaggerated.
It's just a tool; use it the right way for the right job, and you'll get the right result.
Pablo_A
Multi threading is anything but boring.
A few days ago, I was asked to improve the performance of a program which reads a FoxPro table, generates SQL INSERT commands, and executes them against a database.
The way to do it was clear: read in one thread, write on another. To spice things up, since inserting is much slower than reading (all those index updates, you know), three writer threads were matched to the reader.
The writers' thread procedure looked like this (pseudo-code):
DWORD WINAPI ThreadProc(LPVOID pX)
{
    MSG msg;
    bool done = false;
    ::PeekMessage(&msg, NULL, 0, 0, PM_NOREMOVE); // force creation of the queue
    while (!done)
    {
        ::GetMessage(&msg, NULL, 0, 0);
        switch (msg.message)
        {
        case WM_COMMAND:
            // pop the Job from the queue, execute it, delete it
            break;
        case WM_QUIT:
            done = true;
            break;
        }
    }
    ::SetEvent(reallyDone);
    return 0;
}
Nice, right?
Tested it with 16K records: runs like the wind, results are fine.
64K records: still OK.
80K records: some start to get lost.
I've used ::PostThreadMessage() for quite some time: the caller allocates a Job and posts it to the queue; the worker thread pops it from the queue, executes it, and deletes it. All synchronization is taken care of by the OS, which minimizes thread switches.
Well, you always have to read the fine print. And the fine print for ::PostThreadMessage() is that the queue holds at most 10,000 messages. That's a lot, if you have a few dozen long-running jobs (which, in the past, was my usual case).
So, how could I handle it?
My first idea was the use of a 'thermostat'. An integer was interlock-incremented whenever a message was posted, and interlock-decremented whenever a job was executed, thus holding the count of messages in the queue.
The manager thread would test the queue size: if above 900 messages, it would Sleep for 50 milliseconds; if, after that, there were fewer than 1000 messages, it would try to post one (in a loop); if after 5 tries the message was still not posted, the job would be executed on the calling thread.
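The counting half of that 'thermostat' can be sketched like this. A minimal illustration, with assumptions spelled out: `TryPost()` stands in for the real ::PostThreadMessage() call, the 900-message high-water mark is taken from the description above, and `std::atomic` plays the role of InterlockedIncrement/InterlockedDecrement.

```cpp
#include <atomic>

// In-flight message counter, shared by the posting and working threads.
static std::atomic<int> g_pending{0};
static const int kHighWater = 900;   // threshold from the text above

// Stand-in for posting a job via ::PostThreadMessage().
bool TryPost() {
    if (g_pending.load() >= kHighWater)
        return false;                // caller backs off (Sleep) and retries
    g_pending.fetch_add(1);          // InterlockedIncrement in Win32
    return true;
}

// Called by the worker after executing a job.
void JobDone() {
    g_pending.fetch_sub(1);          // InterlockedDecrement in Win32
}
```

Note the check-then-increment is not atomic as a pair, so the count can briefly overshoot the high-water mark; for a throttle that only needs to stay well under the 10,000-message OS limit, that slack is harmless.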
I didn't like it: whole lot of interlocking going on, and Sleep is by far not my favorite API.
Well, the bad news was as expected: performance went down by 30%, but still it was three times faster than the original program.
The good news: 80K records, and none got lost. 150K, and none got lost!
So I tried for 400K (the program should be able to send about a million): well, it got stuck after 215K. Playing with the thermostat settings helped a bit, but no cigar.
After a couple of hours, I gave up. I added a std::queue&lt;Job *&gt; to the mix, managed synchronization with a critical section (which protected this queue), and performance went back up. Private bytes and CPU usage were totally flat, up to 400K records.
The program was ripe for the QA department.
What did I learn?
::PostThreadMessage() is a nice tool, but it has its limitations. Use it when you have just a few 'big' jobs, in a program which runs, does its thing, and exits. For long-running programs, or programs which send a lot of small jobs to the worker threads, you're better off rolling your own queue.
And yet, multithreading, when you know how to do it, can be a great way to make an impression, and to become the star of your company parties.
Pablo