|
|
| Author |
Message |
spamiam
Joined: 13 Nov 2005
Posts: 665
|
|
Posted: 21 September 2006, 13:55 PM Post subject: speed of PutPin() instruction. |
|
|
I want to toggle a pin.
Currently I use
| Code: | Call PutPin(SCLK, 1)
Call PutPin(SCLK, 0) |
Is there a faster technique? The above technique seems to take about 40uS, because I see a 20uS wide pulse.
Is there something that will execute faster than this? I want to increase the overall bandwidth.
I could use PulseOut() to get a narrower pulse, but the latency of the instruction is longer (more set-up to do, I suppose), so the bandwidth is much worse.
Suggestions?
-Tony |
|
dkinzer Site Admin
Joined: 03 Sep 2005
Posts: 2499
Location: Portland, OR
|
|
Posted: 22 September 2006, 1:22 AM Post subject: Re: speed of PutPin() instruction. |
|
|
| spamiam wrote: | The above technique seems to take about 40uS, because I see a 20uS wide pulse.
Is there something that will get executed faster than this. |
The code below appears to execute in about 75% of the time taken by the PutPin() method.
| Code: | Call ToggleBits(Register.PortC, &H01)
Call ToggleBits(Register.PortC, &H01) |
I confirmed that the PulseOut() method does take slightly longer than the PutPin() method, probably due to the pre- and post-operation timer manipulation.
I can imagine a new function PulsePin() that, given a pin number, would generate a fast pulse on the pin. The width of the pulse would likely be in the range of 6 CPU cycles (about 400nS) and the overall execution time would likely be about the same as a single PutPin() call. |
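As a rough check on the "6 CPU cycles (about 400nS)" estimate, the arithmetic can be sketched as below. Python is used here only as a calculator. The 14.7456 MHz clock is an assumption about the ZX-24 hardware, not a figure stated in this thread, and `cycles_to_ns` is a hypothetical helper name.

```python
# Assumed CPU clock (14.7456 MHz); this value is NOT given in the thread.
CPU_HZ = 14_745_600


def cycles_to_ns(cycles, cpu_hz=CPU_HZ):
    """Convert a number of CPU cycles to nanoseconds at the given clock rate."""
    return cycles * 1e9 / cpu_hz


pulse_ns = cycles_to_ns(6)
print(f"{pulse_ns:.0f} ns")  # prints "407 ns"
```

Under that clock assumption, six cycles come to roughly 407 ns, which matches the "about 400nS" figure quoted above.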
|
spamiam
Joined: 13 Nov 2005
Posts: 665
|
|
Posted: 22 September 2006, 20:54 PM Post subject: |
|
|
Thanks for the suggestion.
I will try it out and see how much it speeds up the overall transmission of a byte. I can make this change easily. The only slightly inconvenient thing is that I have to use two constants to define the pin in question: the port and the bit within the port. Not hard to do, just 2 more constants that some user might need to keep track of.
25% improvement is worth it.
The idea of PulsePin(PinNumber) is a good one. I wonder if 400ns is too fast? For SPI MMC cards, it is easily within the speed ratings. But I wonder if other devices might find it just a tad too short? Maybe 1000nS? Maybe a second arg indicating a short/long pulse?
Since this would execute at nearly the speed of a single PutPin() instruction, the speed of the entire s/w SPI write/read byte would be significantly increased.
After that, the only thing faster would be the same subroutine built into the VM itself as a S/W SPI function.
The statement might be something like this
| Code: | ReadByte = SPIByte(MOSI,MISO,SCLK,DATA) |
-Tony |
|
spamiam
Joined: 13 Nov 2005
Posts: 665
|
|
Posted: 23 September 2006, 1:05 AM Post subject: |
|
|
OK, I tried the ToggleBits() technique and it DOES speed up the bandwidth... by 0.5%. Not much, but it is meaningful.
The two ToggleBits() instructions seem to result in a high pulse about 0.4uS shorter than the equivalent PutPin() pair. This suggests that I am saving 0.8uS in total for the two instructions, for each bit in the byte.
Here is my SPI byte shifter code:
| Code: | Public Function SPI_IO(ByVal SB As Byte) As Byte
    ' SPI_IO is not initialized and this is OK! Ignore the warning if/when you get it!
    ' This shifts 8 bits out and in at 0.165mS per bit, plus overhead of 0.136mS per byte.
    Dim I As Byte
    SB = FlipBits(SB) ' Flip the bits so that they can be sent low bit first,
                      ' which lets a simple PutPin() drive MOSI
    For I = 1 To 8 ' Shift 8 bits
        Call PutPin(MOSI, SB And &B0000_0001)
        SB = SB \ 2 ' Shift the send byte one bit to the right
        SPI_IO = SPI_IO * 2 + GetPin(MISO)
        'Call PutPin(SCLK, zxOutputHigh) ' Set SCLK High
        'Call PutPin(SCLK, zxOutputLow) ' Set SCLK Low
        Call ToggleBits(Register.PortC, &B0000_1000) ' Toggle pin 9; this improves speed by 0.5%
        Call ToggleBits(Register.PortC, &B0000_1000) ' Toggle it back
    Next
End Function |
Using the FlipBits() seems like an unnecessary step, but it allows me to use a simple PutPin() instruction, rather than a conditional test leading up to a PutPin(). It costs time once, and saves time 8 times.
Does anyone see any other optimizations I can make?
-Tony |
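The bit ordering in SPI_IO is easy to get wrong, so here is a minimal model of the loop in Python (chosen only because it can be run anywhere; it is not ZBasic). It suggests that flipping the byte once and then sending the low bit each time puts the original byte on MOSI most significant bit first, and that the starting value of the result does not matter once eight bits have been shifted in. The names `flip_bits` and `spi_io_model` are hypothetical, the MISO samples are supplied as a list instead of being read from a pin, and the `& 0xFF` mask assumes that Byte arithmetic on the device wraps at 8 bits.

```python
def flip_bits(b):
    """Reverse the bit order of an 8-bit value (a stand-in for FlipBits on a Byte)."""
    out = 0
    for _ in range(8):
        out = (out << 1) | (b & 1)
        b >>= 1
    return out


def spi_io_model(send_byte, miso_bits, initial_result=0):
    """Model the SPI_IO loop; return (bits placed on MOSI in order, byte received)."""
    sb = flip_bits(send_byte)
    result = initial_result  # models the uninitialized function return value
    mosi_bits = []
    for i in range(8):
        mosi_bits.append(sb & 1)  # PutPin(MOSI, SB And 1)
        sb >>= 1                  # SB = SB \ 2
        # SPI_IO = SPI_IO * 2 + GetPin(MISO); the mask assumes 8-bit wraparound
        result = (result * 2 + miso_bits[i]) & 0xFF
    return mosi_bits, result


# Send 0xB1 (binary 1011 0001) while the slave answers 0xC3, MSB first.
sent, received = spi_io_model(0xB1, [1, 1, 0, 0, 0, 0, 1, 1], initial_result=0x7F)
print(sent, hex(received))
```

In this model the sent bits come out as 1, 0, 1, 1, 0, 0, 0, 1 (0xB1 from the top bit down) and the received byte is 0xC3 even though the result started at 0x7F.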
|
stevech
Joined: 23 Feb 2006
Posts: 657
|
|
Posted: 23 September 2006, 2:07 AM Post subject: |
|
|
| You could "unroll" the code: take out the For loop and repeat the body 8 times. That saves a little overhead, maybe 15%. |
|
spamiam
Joined: 13 Nov 2005
Posts: 665
|
|
Posted: 23 September 2006, 3:07 AM Post subject: |
|
|
| stevech wrote: | | you could "unroll" the code - take out the for loop and repeat the code 8 times. Saves a little overhead, like maybe 15% |
Good Point.
I will try it and see. I forgot about doing that one.
-Tony |
|
stevech
Joined: 23 Feb 2006
Posts: 657
|
|
Posted: 23 September 2006, 3:24 AM Post subject: |
|
|
| i wonder if using the CPU register accesses in ZBasic is faster for some things. |
|
pjc30943
Joined: 02 Dec 2005
Posts: 220
|
|
Posted: 23 September 2006, 3:37 AM Post subject: |
|
|
Just a minor point, but since you seem to be going for max speed at all costs...
With the VM, is * 2 or / 2 as fast as a bit shift, e.g. <<? Generally a shift is faster... |
|
dkinzer Site Admin
Joined: 03 Sep 2005
Posts: 2499
Location: Portland, OR
|
|
Posted: 23 September 2006, 4:22 AM Post subject: |
|
|
| pjc30943 wrote: | | With the VM, is * or / 2 as fast as shift bits |
Many such questions can be answered by perusing the listing file. For example, for the divide-by-two, the code generated by the compiler is:
| Code: | SB = SB \ 2 ' Shift Send Byte Bit to Left
0034 23fbff PSHR_B bp-5
0037 1a01 PSHI_B 0x01 (1)
0039 96 SHR_B
003a 26fbff POPR_B bp-5 |
With optimization on, the compiler will choose optimizations like this when it can. Another example:
| Code: | SPI_IO = SPI_IO * 2 + GetPin(MISO)
003d 1a0b PSHI_B 0x0b (11)
003f d5 GETPIN
0040 23faff PSHR_B bp-6
0043 16 DUP_B
0044 35 ADD_B
0045 35 ADD_B
0046 26faff POPR_B bp-6 |
Here, instead of generating a shift, the compiler generated code to duplicate the item on the top of the stack and then add the two top values. Multiplying by higher powers of two will yield a shift-left instruction. |
|
spamiam
Joined: 13 Nov 2005
Posts: 665
|
|
Posted: 23 September 2006, 13:21 PM Post subject: |
|
|
| Quote: | | Many such questions can be answered by perusing the listing file | Now I have to remember how/where to put the --List instruction....
Point well taken. Admittedly, I have not been checking the listing file. But, based on previous experience with optimization of simple multiplies/divides (like by 2), I took it for granted that the compiler would give me the "optimal" operations for the desired math.
I "unrolled" the loop in the subroutine, as Steve suggested. WOW! It really helped. Steve thought that a 15% improvement might be seen.
Well, it was more like 32%!
-Tony |
|
dkinzer Site Admin
Joined: 03 Sep 2005
Posts: 2499
Location: Portland, OR
|
|
Posted: 23 September 2006, 15:30 PM Post subject: |
|
|
| spamiam wrote: | | Now I have to remember how/where to put the --List instruction.... |
Near the top of your .pjt file add a line like this:
The listing will be generated to "foo.lst". |
|
spamiam
Joined: 13 Nov 2005
Posts: 665
|
|
Posted: 23 September 2006, 16:10 PM Post subject: |
|
|
Ah, yes... the memory comes back to me when you TELL me!
I had thought there was a menu option to add lines to the project, but I had to do it through the regular editor.
Not a problem.
One thing is that I see stuff like Call Tog_Bits in the listing, but I do not know what that specific code looks like, so I can't really tell if Tog_Bits will be faster than PutPin directly.
Timing 10,000 of them tells the answer, though.
Now I have the S/W technique running at least twice as fast as it was when I just slapped it together.
-Tony |
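The "time 10,000 of them" approach can be sketched generically as below. This is a Python illustration of the measurement idea only, not ZBasic code: `time_per_call` is a hypothetical helper, and on the device the timestamps would come from whatever timer the platform provides. It times the repeated call, times an empty loop of the same length, and divides the difference by the repetition count so that loop overhead is not charged to the call.

```python
import time


def time_per_call(fn, repetitions=10_000):
    """Average seconds per call of fn, with empty-loop overhead subtracted."""
    start = time.perf_counter()
    for _ in range(repetitions):
        fn()
    timed = time.perf_counter() - start

    # Baseline: the same loop with no call in it.
    start = time.perf_counter()
    for _ in range(repetitions):
        pass
    baseline = time.perf_counter() - start

    # Clamp at zero in case timer noise makes the baseline look slower.
    return max(timed - baseline, 0.0) / repetitions


# Example: compare two candidate operations by their average cost per call.
cost_a = time_per_call(lambda: sum(range(10)))
cost_b = time_per_call(lambda: None)
print(cost_a, cost_b)
```

Averaging over many repetitions is what makes sub-microsecond differences, like the 0.4uS per toggle seen earlier in the thread, visible at all.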
|
|