Saturday, November 26, 2011

Noise problems solved

Back in October I wrote about some phantom limit and e-stop triggers. The noise produced by the spindle motor increases with its speed. I wrote a simple program to monitor the limit and e-stop lines and turn on LEDs in response to state changes. With no debouncing on the lines, it was easy to see how turning up the spindle speed increased the noise.
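The monitor program itself isn't shown here. As a rough sketch of the idea in Python, assuming each line is polled into a list of 0/1 samples (the function name and the sample data are made up for illustration), counting every raw state change with no debouncing might look something like this:

```python
def count_state_changes(samples):
    """Count every change between consecutive raw samples (no debouncing)."""
    changes = 0
    for previous, current in zip(samples, samples[1:]):
        if current != previous:
            changes += 1
    return changes


# Hypothetical samples of one input line: first quiet, then with noise glitches.
quiet = [1, 1, 1, 1, 1, 1, 1, 1]
noisy = [1, 1, 0, 1, 1, 0, 0, 1]

# In a real monitor, a nonzero count is what would light the LED.
print(count_state_changes(quiet))
print(count_state_changes(noisy))
```

Because nothing is filtered out, even a one-sample glitch counts as a change, which is what makes the noise easy to see.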

The wiring for limit and e-stop is in two parts. Both use a standard 4-wire telephone line (the line that connects the phone to the wall jack). One part connects all the limit switches in series. The other part runs from the control panel circuit board to the stepper driver box. Two lines are the switches, the other 2 are the out and back of the stepper opto-circuit power.

All the other lines running from the control panel board to the buttons and LEDs are twisted pair bell wire. So I thought I'd try more of that. I ran the same stuff through all the limit switches. Since the 2 signal lines running to the stepper box were not "out and back" of the same signal, I thought the twisted pair might not solve the problem. So I picked up 50 feet of shielded microphone wire at Radio Shack and used it for the 2 signal lines.

Upon rerunning my noise test program I was able to run the spindle at full speed with no phantom limit or e-stop firing. Yay! I'm actually pretty happy with the Radio Shack wire. I'll probably redo the limit switches again with it.

The other problem I'm having is with thermal shutdown of the spindle motor. My original theory was that the thermal switch was faulty. The time to shutdown depends on speed: the faster the motor runs, the sooner it shuts down. Under no load, 1500 RPM leads to shutdown in about 30 minutes. At 2900 RPM it's less than 20 minutes. I bypassed the thermal switch and connected it to a light bulb and battery so I could see when it opened without it shutting the motor down. I ran the motor at 2900 RPM past the point of the thermal switch opening. At the point where the switch opens, the motor is warm to the touch but not uncomfortable. A few minutes later it is noticeably warmer, I would say uncomfortably warm, but I could still keep my hand on it.

I think that kills the theory of the bad thermal switch. I've talked to Sherline about it and they want me to send the whole motor and speed control assembly back to them. I'm not happy about that. It might be only a week or two delay, but it might cost close to $100, $100 I don't really feel like spending on the project right now.

It's possible the motor has always worked this way, I've just never run it long enough to suspect a problem, though I have experienced a couple thermal shutdowns in the past.

Sunday, November 13, 2011

Improved Stepping Algorithm

It would be good to re-read the post on the original approach to movement to get a good understanding of what's to follow.

This is a follow-up to yesterday's post about the resonance problems I was having. I concluded that I had 2 possible solutions to the problem: change the inter-step timing, or adjust the vector so the "long" axis always steps when the other(s) step. I thought I'd do this in the host software by scaling up the values so we had common denominators or power factors or fractal deconversions or something. I thought that this might result in Really Large Numbers that would pose a problem for the limited resources of the microcontroller. I briefly considered some sort of mathematical simplification of the vectors but thought I might lose accuracy. It seemed like I could produce something that would be good enough but it didn't seem like it was going to be easy.

I'll try to give a better idea of the problem. Referring to my earlier post and this illustration:


If we ignore the Z axis to simplify things, consider a vector of X:3 Y:5. This is actually a rise over run of 3 over 5. It's the inverse of what it sounds like: the axis with the lower number steps more frequently and thus moves farther. First we take the lowest value and subtract it from all axes. This takes us from 3:5 to 0:2. The X axis is at zero, so we step it and reset it to 3. Then we subtract 2 from both, leaving us with 1:0. Y is 0, so we step it and reset it to 5. It continues like this:

X Y
3 5
0 2 X
1 0 Y
0 4 X
0 1 X
2 0 Y
0 3 X
0 0 XY
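
The table above can be reproduced with a few lines of Python. This is only a sketch of the procedure as described, not the actual firmware; the function name and the dictionary layout are my own, and axes that don't move (value 0, like Z here) are assumed to be left out.

```python
def original_steps(vector, loops):
    """Subtract the smallest counter from every axis, step any axis
    that reaches zero, and reset that axis to its original value."""
    counters = dict(vector)  # assumes every axis value is nonzero
    rows = []
    for _ in range(loops):
        smallest = min(counters.values())
        for axis in counters:
            counters[axis] -= smallest
        after = tuple(counters.values())  # the values shown in the table
        stepped = ""
        for axis in counters:
            if counters[axis] == 0:
                stepped += axis
                counters[axis] = vector[axis]  # reset to the original value
        rows.append((after, stepped))
    return rows


for after, stepped in original_steps({"X": 3, "Y": 5}, 7):
    print(*after, stepped)
```

Each printed row is one loop, so with a fixed time per loop X steps on loops 1, 3, 4, 6 and 7. The gaps between X steps are uneven.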

This can be illustrated like this, a time line of steps left to right. Red is X, blue is Y:

The top line is the original implementation. Given constant speed, the time between each number is the same. So the time between X steps varies, as it does for Y. For low speeds or ratios where the variations are frequent enough, this doesn't seem to be a problem. The vector that was failing had a ratio of greater than 1:22. At full speed this produced resonance. I can't really say why, I just seem to have an intuitive understanding of it. It seems a good solution is to adjust the timing between the events, so that both axes could have regular timing. I gave it a shot, and it was fairly easy to code. But it didn't work. I'm working in integers and there isn't enough resolution in the timing to make this work. That's the way it seems; I'm not yet positive it CAN'T be done this way.

I went kite flying with my father and my son this afternoon. I was pondering the problem when an elegant solution presented itself. Rather than stepping an axis only when its counter reaches zero, I step every axis whose counter, after the subtraction, is less than what the smallest value was before the subtraction. Then, instead of resetting the stepped axes to their original values, I add the original value back in. This produces the second line in the picture above. The X axis is the longest axis and it steps with every loop. It appears in this example that Y also steps at regular intervals, but another cycle through will show that Y has back-to-back steps from 5 to 6.
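
Here is a small Python sketch of the revised rule as I've described it, run on the same X:3 Y:5 vector. It's an illustration rather than the firmware code; the function name is made up, and axes with a value of 0 are assumed to be left out.

```python
def improved_steps(vector, loops):
    """Subtract the smallest counter from every axis, step every axis that
    is now below that smallest value, and add its original value back in."""
    counters = dict(vector)  # assumes every axis value is nonzero
    stepped_per_loop = []
    for _ in range(loops):
        smallest = min(counters.values())  # smallest value before subtracting
        stepped = ""
        for axis in counters:
            counters[axis] -= smallest
            if counters[axis] < smallest:
                stepped += axis
                counters[axis] += vector[axis]  # add back, don't reset
        stepped_per_loop.append(stepped)
    return stepped_per_loop


for loop, stepped in enumerate(improved_steps({"X": 3, "Y": 5}, 6), start=1):
    print(loop, stepped)
```

X steps on every loop. Y steps on loops 1, 3 and 5, then again on loop 6, which is the back-to-back pair mentioned above.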

It was very easy to make this change to the code and it works very nicely. The problem vector is now perfectly smooth at full speed.

I ran a test path with a ball-point pen in the chuck tracing on paper. The run took 4 minutes. Just as it finished the last part of the trace I got a phantom limit switch trigger. The spindle motor was not running. The fan in the controller box was, so that's a possible source. It's also possible my cell phone is a considerable source of noise.

I also forgot to mention a failure from last week. I was running a long test, about 30 minutes. Near the end the spindle motor shut down. It has a built-in thermal breaker. I wasn't cutting anything so no loss. I was running at 2000 RPM. I need to look at the brushes on the motor and see if they need replacing. If they look good I'll run a test at 1500 RPM and see how long the motor runs before a thermal shutdown.

I'm going to look into rewiring my limit switches with a shielded cable as well.

Saturday, November 12, 2011

Binary Protocol Deployed

A month ago I thought I was close to routing my first circuit board. I had some problems with the running of the toolpath. I thought the general problem was noise and I may have been right, I still have more data to collect on that. I found that the spindle motor produces enough noise above 2000 RPM to interfere with things, particularly the E-stop. I was running at full speed, 2500+. Staying under 2000 helped a lot.

Suspecting bad communications, I rewrote the serial protocol. The packet design, bit shifting and checksum all went well. Getting the 2 "nodes" to carry on a dialog proved to be quite a challenge. Clear exchanges were no problem, correcting errors was. A low error rate was easy to deal with. My protocol does not send an Ack, just a Nak. I decided to leave the Ack to the business layer. Higher error rates with back-to-back errors turned out to be difficult to deal with.

Imagine I'm in the back seat giving you driving directions. I say "turn left", you turn left. If you don't understand me you say "say again". Then I repeat what I said. But if I say "turn left", you say "say again" and I don't understand that, I say "say again". The last thing you said was "say again" so you say it again. The last thing I said was "say again" too, and we are in an active lock.

Another scenario results in you hearing "turn left" twice and we go down the wrong street. I managed to fix the first scenario but not the second. I set up 2 Java threads talking to each other with random error injection. A few errors in a thousand was no problem. 5% errors led to duplicate messages. Around 3% errors the code could handle. I suspect the real error rate will be much, much lower so it should be good enough.
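
The "say again" lock from the driving example can be reproduced with a toy model in Python. To be clear, this is not my protocol or its packet format. It assumes just two rules, that a garbled message is answered with a Nak and that a Nak is answered by repeating whatever was sent last, and the class, function and parameter names are invented for the sketch.

```python
NAK = "say again"


class Node:
    """Toy node: Nak anything garbled, repeat the last message on a Nak."""

    def __init__(self):
        self.last_sent = None

    def send(self, message):
        self.last_sent = message  # naive: a Nak is remembered too
        return message

    def receive(self, heard):
        if heard is None:  # None stands for a garbled message
            return self.send(NAK)
        if heard == NAK:
            if self.last_sent is None:
                return None
            return self.send(self.last_sent)
        return None  # understood; no Ack at this layer


def converse(first_message, garbled_turns, max_turns):
    """Return what was put on the wire, turn by turn."""
    nodes = [Node(), Node()]
    sender = 0
    message = nodes[sender].send(first_message)
    trace = []
    for turn in range(max_turns):
        if message is None:
            break
        trace.append(message)
        heard = None if turn in garbled_turns else message
        sender = 1 - sender  # the other node answers
        message = nodes[sender].receive(heard)
    return trace


print(converse("turn left", {0}, 6))     # one error: recovers
print(converse("turn left", {0, 1}, 6))  # back-to-back errors: lock
```

With a single garbled message the exchange recovers after one Nak. With two in a row, both sides end up repeating "say again" until the turn limit, even though the line is clean from then on.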

I dropped the new binary protocol into the firmware and ran my circuit board tool path. In addition to the noise problems I had some "transport" problems, meaning the machine didn't move the prescribed distance. Some of the moves seemed rough and there was some stalling too (resonance?). I don't mean the motors weren't strong enough, there's like no resistance here. This is a problem with pulse timing. The motors need to get pulses with timing that is appropriate for the current speed of the motor. I've written about that before.

The question is why is this coming up now after all the programmed paths I've run? The toolpaths generated by the various software tools I've been using that convert SVG and other sources to Gcode have been producing various conditions that reveal flaws in my code. The paths I've been generating from Java are relatively simple. This is the vector that was running rough at best, stalling at worst:


X:1203 Y:260741 Z:0

Since Z is zero, we're not moving vertically. I'm using a Bresenham algorithm (I think) to move the axes in some ratio. Above, the axes move in something like a ratio of 1203:260741. I say "something like" because it's actually the X axis that is longer. I've written about this before, too. I subtract 1203 from both values, X is then zero so I step it and reset. I continue this until the Y value is zero and I step it. If the machine is moving at a constant velocity, the time between pulses is fixed. Each axis that moves should step at a regular interval. With my implementation, I subtract the smallest value from all until one or more is zero. That's X most of the time until Y gets down to 893. Then I subtract 893 from both. Y is now zero but X is 310 so I step Y. Then I subtract 310 from both and step X again. So X and Y are not stepping at the same time. I don't think this is a problem, the problem is that the time between steps is fixed which means we waited twice as long as we were supposed to between X steps while we were stepping Y. At least that's my theory. I changed the Y value to 260744 which is evenly divisible by 1203. This resulted in much smoother motion and no stalling.
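
The numbers 893 and 310 above, and the doubled wait between X steps, can be checked with a short Python sketch. It models the subtract-and-reset loop as described, counting each loop as one fixed time interval; the function name is made up and this is not the firmware code.

```python
def x_step_loops(x, y, loops):
    """Return the loop numbers on which X steps (original algorithm)."""
    counter_x, counter_y = x, y
    hits = []
    for loop in range(1, loops + 1):
        smallest = min(counter_x, counter_y)
        counter_x -= smallest
        counter_y -= smallest
        if counter_x == 0:
            hits.append(loop)
            counter_x = x  # step X and reset
        if counter_y == 0:
            counter_y = y  # step Y and reset
    return hits


# Y falls by 1203 per loop until less than 1203 is left.
print(divmod(260741, 1203))  # (whole subtractions, remainder)
print(1203 - 893)            # what is left on X when Y reaches zero

hits = x_step_loops(1203, 260741, 220)
gaps = [(a, b) for a, b in zip(hits, hits[1:]) if b - a > 1]
print(gaps)  # pairs of loops where X skipped a loop
```

X steps on each of the first 216 loops, skips loop 217 while Y steps, and steps again on loop 218, so one gap between X steps is twice as long as the rest.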

At some point early in the process (almost a year ago) I ran across this post by Thomas. He touches on the Bresenham algorithm and some more advanced acceleration algorithms. In the case he gives, a ratio of 3:2 is smooth enough. A ratio of 3:8 should result in the same problem I have above. Thomas talks some serious stuff about stepper motor modeling. I'm still feeling positive that I can keep things pretty simple and still get good performance. 

I'm not quite sure what approach to take to solve this. I could tweak things in the host software so that the axis ratios are such that the longest axis steps every loop. I might lose some accuracy, though. Another idea is to use the axis ratios to drive the timing between steps. I can't really think of a coherent way to describe what I'm thinking of. I'll try to work it out on paper and post it later.