Latency for servicing interrupts
pjc30943



Joined: 02 Dec 2005
Posts: 220

Posted: 30 July 2008, 1:55 AM    Post subject: Latency for servicing interrupts

This was posted recently in the forum, but despite searching I can't seem to locate the post.
Assuming interrupts are not disabled (by blocking functions, etc) and that no other interrupt is already being called, what's the maximum guaranteed delay for servicing interrupts?

[The number I remember is 25us.]
dkinzer
Site Admin


Joined: 03 Sep 2005
Posts: 2499
Location: Portland, OR

Posted: 30 July 2008, 15:18 PM    Post subject: Re: Latency for servicing interrupts

pjc30943 wrote:
what's the maximum guaranteed delay for servicing interrupts?
If interrupts are enabled and no other interrupt is pending, a new interrupt will be serviced immediately. It takes 4 CPU cycles to save the current program counter address on the stack and then 3 more to jump to the ISR indirectly through the interrupt vector. At that point, the processor is executing the code of your ISR.

If the ISR is written in ZBasic, there will be a sequence of prologue code automatically generated by the compiler to save registers used in the ISR. This could range from zero (in rare cases) to 67 cycles (also rare).

It is generally advisable to do as little as possible in an ISR. Avoiding the use of ZBasic routines (including ZBasic Library routines) will reduce the size of the prologue code and increase execution speed. Writing the ISR in assembly language will provide the most control and the best opportunity for hand optimization but it is also the most tedious and error-prone method.
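As a rough worked example of the cycle counts above, the sketch below adds the 4 + 3 cycle hardware response to the 0 to 67 cycle prologue range and converts the totals to time. The 14.7456 MHz clock and the names CPU_HZ, cycles_to_us and entry_latency_us are assumptions for illustration; substitute the clock of the actual device.

```python
# Assumed CPU clock in Hz (not stated in the post above; adjust as needed).
CPU_HZ = 14_745_600

HW_RESPONSE_CYCLES = 4 + 3        # push program counter + jump through vector
PROLOGUE_RANGE_CYCLES = (0, 67)   # compiler-generated prologue, best/worst case


def cycles_to_us(cycles, cpu_hz=CPU_HZ):
    """Convert a CPU cycle count to microseconds."""
    return cycles * 1_000_000 / cpu_hz


def entry_latency_us(prologue_cycles, cpu_hz=CPU_HZ):
    """Time from interrupt acceptance to the first user statement of the ISR."""
    return cycles_to_us(HW_RESPONSE_CYCLES + prologue_cycles, cpu_hz)


best = entry_latency_us(PROLOGUE_RANGE_CYCLES[0])
worst = entry_latency_us(PROLOGUE_RANGE_CYCLES[1])
print(f"best case:  {best:.2f} us")
print(f"worst case: {worst:.2f} us")
```

Under that clock assumption, even the worst-case prologue keeps the entry overhead to a few microseconds; anything beyond that comes from the body of the ISR itself.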

pjc30943
Posted: 30 July 2008, 20:01 PM

Okay, that clarifies a few things.
But it also raises a few questions: I was under the impression that the compiler translates code directly into assembly. As an example, the following ISR takes about 25us to execute. With only the pin states being changed, it's 12us (which seems a bit long). Latency with this one interrupt [and an exited Main() ] is about 18us, and varies by about half that amount.


Code:
ISR INT4()
   'ISR for thighR encoder
   static timerStart as unsignedinteger
   static timerEnd as unsignedinteger
   
   putpin loopTogglePin, zxOutputHigh
   if ( getpin(thighEncoderR) = activeState(thighEncoderR) ) then
      timerStart = register.TCNT5
   else
      thighEncoderRPulse = register.TCNT5 - timerStart
   end if
   putpin loopTogglePin, zxOutputLow
End ISR


This is just about as bare as it can get using zbasic; would assembly be faster?

As you probably recall, there are four ISRs as above running, monitoring 250Hz encoder PWM outputs.
The case I asked about in the original post was, of course, with no other interrupts enabled. For the case of multiple interrupts, in an attempt to reduce latency (and thus improve the accuracy of pulse measurements if all four interrupts are enabled), the code should execute faster... Though perhaps it's already at the minimum.
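To put those figures in proportion, here is a small worked calculation. It assumes the roughly 9us variation in latency (half of 18us) lands directly on a measured edge, and it uses one PWM period at 250Hz as the reference. The function name is hypothetical.

```python
def jitter_fraction(jitter_us, pwm_hz):
    """Return timing jitter as a fraction of one PWM period."""
    period_us = 1_000_000 / pwm_hz
    return jitter_us / period_us


latency_us = 18.0                 # measured latency quoted above
jitter_us = latency_us / 2        # "varies by about half that amount"
fraction = jitter_fraction(jitter_us, pwm_hz=250)
print(f"period: {1_000_000 / 250:.0f} us, jitter: {fraction * 100:.3f} % of period")
```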

dkinzer
Posted: 30 July 2008, 23:02 PM

pjc30943 wrote:
I was under the impression that the compiler translates code directly into assembly.
Actually, for native mode devices the compiler translates the ZBasic code into equivalent C code which is then compiled and linked in the back-end process.

pjc30943 wrote:
the [...] ISR takes about 25us to execute
Because the ISR calls other routines (PutPin(), GetPin() and possibly activeState()), the ISR must begin with some prologue code that saves registers that might be used by those routines and initializes certain registers that must be in a specific state. The prologue code generated for the ISR that you specified is shown below. The prologue code takes 34 CPU cycles to execute. The epilogue, which performs a complementary function, takes 31 CPU cycles. Together, these take about 4.4uS to execute.
Code:
push    r1
push    r0
in      r0, 0x3f
push    r0
eor     r1, r1
push    r18
push    r19
push    r20
push    r21
push    r22
push    r23
push    r24
push    r25
push    r26
push    r27
push    r30
push    r31
The first five instructions of the prologue code save registers r1 and r0 and the current value of the status register, and set register r1 to zero, as called routines expect it to be. The remainder of the prologue code saves registers r18-r27 and r30, r31 because called routines are allowed to modify any of those registers. In fact, the called routines may actually modify only a few of those registers (perhaps none), but the compiler has no knowledge of actual register usage for any ZBasic Library routines nor for any routines contained in other modules.
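The timing quoted above can be checked with a little arithmetic. The sketch below works backwards from 34 + 31 cycles taking about 4.4uS to the clock rate that implies, and then expresses that overhead as a share of the 25us ISR. The variable names are illustrative only.

```python
PROLOGUE_CYCLES = 34
EPILOGUE_CYCLES = 31
OVERHEAD_US = 4.4      # combined prologue + epilogue time quoted above
ISR_TOTAL_US = 25.0    # measured execution time of the original ISR

overhead_cycles = PROLOGUE_CYCLES + EPILOGUE_CYCLES

# Cycles per microsecond is numerically the clock rate in MHz.
implied_clock_mhz = overhead_cycles / OVERHEAD_US

# Share of the ISR's run time spent only on saving/restoring registers.
overhead_share = OVERHEAD_US / ISR_TOTAL_US

print(f"{overhead_cycles} cycles -> implied clock ~{implied_clock_mhz:.1f} MHz")
print(f"prologue + epilogue ~{overhead_share:.0%} of the ISR")
```

On these numbers the register save/restore accounts for under a fifth of the measured time; the rest is presumably spent in the called routines and the ISR body.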

pjc30943 wrote:
This is just about as bare as it can get using zbasic; would assembly be faster?
Probably so, but only if you can avoid calling other routines. I would need to know several things in order to write an equivalent ISR in assembly language.
  • the pin number represented by thighEncoderR
  • the pin number represented by loopTogglePin
  • the internal workings of activeState(), assuming it is a function

pjc30943
Posted: 31 July 2008, 1:16 AM

Thanks for the explanation. So routines are not replaced by their equivalent code inline into the ISR (not that there's a particular reason to believe they would be), but called.

Does a public example exist of a naked ISR that has the minimum required stack stored only, assuming no library calls and only assembly?

dkinzer wrote:
the pin number represented by thighEncoderR
the pin number represented by loopTogglePin
the internal workings of activeState(), assuming it is a function


ThighEncoderR is E.4, loopTogglePin is J.0. ActiveState is just an array of 1 or 0, with it being 1 in this case. It'd be interesting to measure the decrease (if any) in execution time.

dkinzer
Posted: 31 July 2008, 4:54 AM

pjc30943 wrote:
Does a public example exist of a naked ISR that has the minimum required stack stored only, assuming no library calls and only assembly?
As yet, no. I created a separate .S file with an AVR assembly language implementation of an ISR for INT4 (see the attached file) that I believe does what you need. It is untested, however.

pjc30943 wrote:
ActiveState is just an array of 1 or 0, with it being 1 in this case.
I don't understand how that works being indexed with a pin descriptor value like E.4 unless the array is 256 bytes in length. The compiler converts E.4 to &Ha4 - a value that has the port number and bit number encoded in it.

If the active state is constant, the ISR could be coded in a faster fashion than if it is a run-time variable. The attached file assumes that it is variable.

The assembly language ISR uses three variables that need to be defined in another module thusly:
Code:
Public thighEncoderRPulse as UnsignedInteger Attribute(Used)
Public timerStart4 as UnsignedInteger Attribute(Used)
Public activeState4 as Byte Attribute(Used)
The extra attribute is needed to tell the compiler to treat the variables as if they are used even if it can't see where they are used. Without that, the variables would be eliminated if there aren't any references to them in ZBasic code.

The ISR uses two macros to simplify reading and writing I/O ports called inPort and outPort. The issue is that I/O ports from 0 through &H3f can be accessed using the in and out instructions while the higher addressed I/O ports have to be read and written using the longer lds and sts instructions. To complicate matters further, a particular I/O port may be in the low block on one AVR device but in the high block on another. Using these macros takes out all of the guesswork and makes the code portable from one device to another without changes.
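The choice those macros make can be sketched as follows. This is a host-side Python model, not the macro source from the attached file: the function name is made up, and it assumes the standard AVR layout in which the 64 I/O locations reachable by in/out appear in data space at an offset of &H20.

```python
IO_SPACE_OFFSET = 0x20   # data-space address of I/O location 0
IO_SPACE_LAST = 0x3F     # highest I/O location reachable by in/out


def select_port_access(data_addr, write=False):
    """Pick the instruction and operand address for a register at data_addr.

    Returns (mnemonic, address). in/out take the I/O-space address,
    lds/sts take the full data-space address.
    """
    io_addr = data_addr - IO_SPACE_OFFSET
    if 0 <= io_addr <= IO_SPACE_LAST:
        return ("out" if write else "in", io_addr)
    return ("sts" if write else "lds", data_addr)


# SREG is I/O location &H3f, so it sits at data address &H5f.
print(select_port_access(0x5F))
# An example register in the extended range (address chosen for illustration).
print(select_port_access(0x105, write=True))
```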

The code that I wrote only uses 3 registers (which I've given the names temp, tmpLo and tmpHi using the #define directive) and it doesn't call any other routines so the prologue code looks like this:
Code:
    // save the registers used in the ISR
    push    temp
    inPort  temp, SREG
    push    temp
    push    tmpLo
    push    tmpHi

Next, the code to set the loop toggle pin looks like this:
Code:
    // set the loop toggle pin high
    inPort  tmpLo, togPort
    ori     tmpLo, togMask
    outPort togPort, tmpLo

The identifiers togPort and togMask are defined to be the port name and corresponding bit mask for the I/O line being used.

Next, the code creates a bitmask for determining if the encoder input is in the active state. As mentioned earlier, this could be simplified if this were a compile-time constant.
Code:
    // get the active state as a bit corresponding to the encoder input
    lds     tmpLo, zv_activeState4
    ldi     tmpHi, encMask
    tst     tmpLo
    brne    1f
    eor     tmpHi, tmpHi
1:
At this point, the tmpHi register contains either zero or a single bit in the same position as the encoder input in the I/O port. The next sequence of code reads the I/O port to which the encoder is connected, masks out all but one bit and then compares the result with the previously created mask in tmpHi. If they match, the encoder input is in the active state.
Code:
    // get the encoder input state, compare to the active state
    inPort  tmpLo, encPin
    andi    tmpLo, encMask
    cp      tmpLo, tmpHi
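The mask-and-compare logic of the last two fragments can be modeled on a PC as follows. This is an illustrative Python sketch, not a translation of the attached file; the mask value &H10 assumes the encoder is on bit 4 of its port (pin E.4), and the function names are hypothetical.

```python
ENC_MASK = 0x10   # bit 4 of the port, assuming the encoder is on E.4


def active_mask(active_state, enc_mask=ENC_MASK):
    """Mirror lds/ldi/tst/brne/eor: enc_mask if active_state is nonzero, else 0."""
    return enc_mask if active_state != 0 else 0


def encoder_is_active(pin_value, active_state, enc_mask=ENC_MASK):
    """Mirror andi/cp: isolate the encoder bit and compare with the mask."""
    return (pin_value & enc_mask) == active_mask(active_state, enc_mask)


# Other bits on the port are ignored; only bit 4 matters.
print(encoder_is_active(0b1111_0111, active_state=1))  # bit 4 set, active-high
print(encoder_is_active(0b0000_0000, active_state=0))  # bit 4 clear, active-low
```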
Next, the current state of the timer is read. Note that this operation does not affect the status bits generated by the comparison instruction. This fact allows the conditional branch to be done after the counter is read eliminating the need to either use two additional registers or to duplicate the timer reading code.
Code:
    // read the counter state
    inPort  tmpLo, TCNT5L
    inPort  tmpHi, TCNT5H

    // handle the active and inactive states
    brne    inactive
The action for the active state is simply to store the counter value just read and then jump over the inactive state code:
Code:
active:
    // the encoder input is in the active state, store the timer value
    sts     zv_timerStart4 + 0, tmpLo
    sts     zv_timerStart4 + 1, tmpHi
    rjmp    done
The action for the inactive state is to subtract the previously saved timer value from the current timer value and store the result. Note that the same register is re-used for the two bytes of the previous timer value. This reduces the number of registers that need to be saved/restored and has no effect on speed or code size.
Code:
inactive:
    // the encoder input is in the inactive state, compute the pulse width
    lds     temp, zv_timerStart4 + 0
    sub     tmpLo, temp
    lds     temp, zv_timerStart4 + 1
    sbc     tmpHi, temp
    sts     zv_thighEncoderRPulse + 0, tmpLo
    sts     zv_thighEncoderRPulse + 1, tmpHi
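The sub/sbc pair above is an ordinary 16-bit subtraction done one byte at a time. Because it wraps modulo 65536, the result should still be the true pulse width when the timer rolls over between the two edges, as long as the pulse is shorter than one full timer period. A host-side Python sketch with hypothetical function names:

```python
def sub16_bytewise(cur_lo, cur_hi, start_lo, start_hi):
    """Emulate 'sub' on the low bytes followed by 'sbc' on the high bytes."""
    lo = cur_lo - start_lo
    borrow = 1 if lo < 0 else 0          # carry flag left by 'sub'
    hi = cur_hi - start_hi - borrow      # 'sbc' subtracts the carry as well
    return lo & 0xFF, hi & 0xFF


def pulse_width(timer_now, timer_start):
    """16-bit pulse width using the byte-wise subtraction above."""
    lo, hi = sub16_bytewise(timer_now & 0xFF, timer_now >> 8,
                            timer_start & 0xFF, timer_start >> 8)
    return (hi << 8) | lo


print(pulse_width(0x1234, 0x1000))   # no rollover
print(pulse_width(0x0010, 0xFFF0))   # timer wrapped between the two edges
```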
The next step is to set the loop toggle pin back low again.
Code:
done:
    // set the loop toggle pin low
    inPort  tmpLo, togPort
    andi    tmpLo, ~togMask
    outPort togPort, tmpLo
And, finally, the registers are restored and a return from interrupt is performed.
Code:
    // restore registers
    pop     tmpHi
    pop     tmpLo
    pop     temp
    outPort SREG, temp
    pop     temp

    reti



Attachment: int4.zip (assembly language ISR example, 1010 bytes)

dkinzer
Posted: 31 July 2008, 14:51 PM

Another alternative is to carefully code the ISR in ZBasic, avoiding any calls. The code below should be faster than your original but not quite as fast as the assembly language version.
Code:
Public thighEncoderRPulse as UnsignedInteger
Private activeState4 as Byte

ISR INT4()
   ' ISR for thighR encoder
   Const loopToggleBitMask as Byte = &H01
   Const encoderBitMask as Byte = &H10

   Static timerStart as UnsignedInteger
   
   ' set the loop toggle pin high
   Register.PortJ = Register.PortJ Or loopToggleBitMask

   ' read the encoder input state
   If ((Register.PinE And encoderBitMask) = activeState4) Then
      timerStart = Register.TCNT5
   Else
      thighEncoderRPulse = Register.TCNT5 - timerStart
   End If

   ' set the loop toggle pin low
   Register.PortJ = Register.PortJ And Not loopToggleBitMask
End ISR


The code above could be made slightly faster if the active state doesn't need to be a variable.
Code:
   Const activeState as Byte = encoderBitMask

   ' read the encoder input state
   If ((Register.PinE And encoderBitMask) = activeState) Then

pjc30943
Posted: 31 July 2008, 20:09 PM

Quote:
I don't understand how that works being indexed with a pin descriptor value like E.4 unless the array is 256 bytes in length. The compiler converts E.4 to &Ha4 - a value that has the port number and bit number encoded in it.


Yes, exactly. It's a long array so that addressing array pins is easily done by just indexing the pin name. Of course if RAM were more limited, this wouldn't be a wise approach.
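The lookup scheme described here can be sketched as below. It is a Python model with made-up names, assuming only what is stated above: a pin descriptor is a single byte (E.4 becomes &Ha4) and the table has one entry for every possible byte value.

```python
PIN_E4 = 0xA4                       # descriptor value for E.4 given above

# One entry per possible descriptor byte, so any pin can index it directly.
active_state = bytearray(256)
active_state[PIN_E4] = 1            # the encoder on E.4 is active-high


def is_active_level(pin_descriptor, level):
    """True if 'level' (0 or 1) is the active state of the given pin."""
    return active_state[pin_descriptor] == level


print(is_active_level(PIN_E4, 1))
print(len(active_state), "bytes of RAM used by the table")
```

The table costs 256 bytes no matter how few pins are actually used, which is the RAM trade-off mentioned above.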

Quote:
The code below should be faster than your original but not quite as fast as the assembly language version.


This is much faster: about 2us vs. 25. That's a large improvement obviously...!

Quote:
I created a separate .S file with an AVR assembly language implementation of an ISR for INT4


Thanks Don. The comments are clear, with the exception of how you decide which registers to use for manipulations (I've only coded in Microchip assembly, which seems more straightforward).
Silly question: is the .S file linked with just a #include? How is it used?

dkinzer
Posted: 31 July 2008, 21:27 PM

pjc30943 wrote:
This is much faster: about 2us vs. 25.
Is that for the simplified ZBasic-coded version?

pjc30943 wrote:
The comments are clear, with the exception of how you decide which registers to use for manipulations
If you want to set an output on port x you refer to PORTx. If you want to read an input on port x you refer to PINx. Normally, you would also have to set the DDRx register as well but in this case I wrote the code to assume that the I/O lines were already configured properly.

For more information on how the three registers associated with each port are used, refer to the corresponding section in the Atmel datasheet for the underlying AVR chip. Most of the AVR chips used for ZX devices (all except the mega32 and the mega128) have a special feature where you can toggle the state of an output pin by writing a 1 to the corresponding position in the PINx register. Odd as this seems, it is useful for reducing execution time. Using this feature, the code for managing the loop toggle pin can be replaced with:
Code:
   ' set the loop toggle pin high
   Register.PinJ = loopToggleBitMask

   [other code]

   ' set the loop toggle pin low
   Register.PinJ = loopToggleBitMask
With this change, the ISR would complete execution in four fewer cycles overall, saving 270nS.
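The difference between the two ways of driving the pin can be modeled as below. This is a behavioral Python sketch with invented names, assuming the toggle behavior just described: a 1 written to a bit of PINx flips the matching PORTx bit, and a 0 leaves it alone.

```python
LOOP_TOGGLE_MASK = 0x01   # bit 0 of port J (pin J.0)


def set_bits_rmw(port, mask):
    """Read-modify-write: PortJ = PortJ Or mask."""
    return (port | mask) & 0xFF


def clear_bits_rmw(port, mask):
    """Read-modify-write: PortJ = PortJ And Not mask."""
    return port & ~mask & 0xFF


def write_pin_register(port, value):
    """Writing 'value' to PINx toggles every PORTx bit where value has a 1."""
    return (port ^ value) & 0xFF


port_j = 0b1010_0000                       # pin low on entry, other bits arbitrary
high = write_pin_register(port_j, LOOP_TOGGLE_MASK)
low = write_pin_register(high, LOOP_TOGGLE_MASK)
print(high == set_bits_rmw(port_j, LOOP_TOGGLE_MASK))
print(low == clear_bits_rmw(high, LOOP_TOGGLE_MASK))
```

Note that the toggle form only matches the set/clear form if the pin is known to be low on entry to the ISR; toggling a pin that is already high would drive it low instead.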

pjc30943 wrote:
is the .S file linked with just a #include? How is it used?
As mentioned in Section 4.2 of the ZBasic Reference manual, you can include special files in your project file that don't contain ZBasic source code. Just add them to your .pjt file. The compiler recognizes the special extensions shown in the table below and handles them differently according to their type.

Ext.   Description
.o     AVR object code file
.S     AVR assembly language file
.c     C source file
.a     AVR object code archive (library)

When the ZBasic compiler encounters files with these special extensions in the .pjt file, it skips over them during the initial compilation phase. Then, when the back-end build is done, it adds them to the build configuration file (makefile) in different ways depending on the type so that they are incorporated correctly in the back-end build.

pjc30943
Posted: 31 July 2008, 21:49 PM

Quote:
Is that for the simplified ZBasic-coded version?


Yes.

Quote:
The comments are clear, with the exception of how you decide which registers to use for manipulations


Sorry for not being clear: I was referring to the three registers r0, r24, r25.

dkinzer
Posted: 31 July 2008, 22:42 PM

pjc30943 wrote:
Sorry for not being clear: I was referring to the three registers r0, r24, r25.
To a certain extent, it doesn't matter which registers you choose since their previous state is saved before and restored after their use. That said, the 32 "general purpose" registers do have some different characteristics that may cause you to choose one over another. For example, there are several "immediate operand" instructions like ldi, ori, andi, subi that are only available for registers r16-r31. Since I wanted to use some of those immediate instructions, my choice was restricted to that range. Additionally, registers r24-r31 can be used in pairs with special instructions like adiw and sbiw. Since I didn't use any of those instructions, that aspect didn't affect the choice. Lastly, registers r26-r31 can be used in pairs for special indexed addressing modes. Here again, I didn't use any of those instructions.

More information is available in the AVR Instruction Set Document.
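The constraints described above can be summarized in a small lookup. This Python sketch only restates the three rules from this post (immediate instructions on r16-r31, adiw/sbiw on pairs within r24-r31, indexed addressing through pairs within r26-r31); the function name is hypothetical and the AVR Instruction Set document remains the reference.

```python
def register_features(n):
    """Return the special capabilities of AVR register r<n> (0..31)."""
    if not 0 <= n <= 31:
        raise ValueError("AVR has registers r0..r31")
    features = set()
    if n >= 16:
        features.add("immediate")      # ldi, ori, andi, subi, ...
    if n >= 24:
        features.add("word-pair")      # adiw / sbiw on register pairs
    if n >= 26:
        features.add("pointer-pair")   # indexed addressing modes
    return features


# r0 has none of these properties; r24 and r25 accept immediate operands.
for reg in (0, 24, 25):
    print(f"r{reg}: {sorted(register_features(reg))}")
```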

There are other considerations if you are writing an assembly language routine that will be called by C routines (or ZBasic routines) or vice versa (none of which apply to the task of writing an ISR). The protocol is that any called function may modify registers r0, r18-r27, r30, r31 and the status register. All other registers must be saved/restored if they are used. Additionally, register r1 is assumed to always contain zero. There is also a protocol describing how parameters are passed to routines. All of this is described in the WinAVR avr-libc documentation.
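For a routine that is called from C or ZBasic code, the rule above determines which registers need a push/pop. A Python sketch of that bookkeeping, with hypothetical names and assuming exactly the register sets listed in this post:

```python
# Registers a called function may modify freely, per the protocol above.
CALL_CLOBBERED = {0} | set(range(18, 28)) | {30, 31}
ZERO_REGISTER = 1   # handled separately: it must contain zero again on return


def must_preserve(used_registers):
    """Registers a called assembly routine has to save and restore."""
    return sorted(r for r in used_registers
                  if r not in CALL_CLOBBERED and r != ZERO_REGISTER)


# A routine using r16, r17, r24 and r25 only needs to preserve r16 and r17.
print(must_preserve({16, 17, 24, 25}))
```

An ISR is a different case: it interrupts arbitrary code, so it has to preserve every register it touches, which is why the compiler-generated prologue shown earlier in the thread pushes the whole call-clobbered set.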

pjc30943
Posted: 01 August 2008, 0:20 AM

That answers the question; thanks for elaborating.


The following is displayed after adding int4.s to the project file, which is odd considering that spaces don't exist.

Code:
Error: assembler source file name "C:\Program Files\ZBasic\int4.S" cannot contain a space character

dkinzer
Posted: 01 August 2008, 1:03 AM

pjc30943 wrote:
which is odd considering that spaces don't exist.
The issue is that none of the elements of the file's complete path name can contain a space. In this case, the offending element is "Program Files".

The reason for this restriction is that the tools used for the back-end processing do not support spaces in pathnames or filenames.
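A quick way to spot the problem before building is to look for spaces in each element of the path. A small Python sketch (the helper name is made up, and the second path is only an example of a location without spaces):

```python
import re


def elements_with_spaces(path):
    """Return the path elements (directories or file name) containing a space."""
    return [part for part in re.split(r"[\\/]", path) if " " in part]


print(elements_with_spaces(r"C:\Program Files\ZBasic\int4.S"))
print(elements_with_spaces(r"C:\ZBasic\projects\int4.S"))
```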

pjc30943
Posted: 01 August 2008, 1:16 AM

Okay, it compiles fine now after changing the file location as directed.
The only change to the .s file was a portE that was supposed to be a portJ. The rest of the code functions.

Interestingly, the speed is about the same as that of the simplified zbasic code; both are around 2us. Any difference is too small to notice for this example... For that reason the zbasic version will be easier to work with, and for no (or little) loss in speed.

EDIT: Upon closer examination, for this specific example it seems the zbasic version is slightly faster than the assembly version, by about 10-15%.
sturgessb



Joined: 25 Apr 2008
Posts: 246
Location: Norwich, UK

Posted: 11 January 2009, 17:15 PM

Don. I see in this thread you mention that using a const or a hardcoded value is faster than a variable. What about a standard Integer vs an Integer Array value?

ie what would be faster...

Code:

variable1 = 1234


or

Code:

variable(1) = 1234


And if so does the same apply to reading the variable as well as writing to it?

Cheers

Ben
Page 1 of 2

 


All content Copyright © 2005-2012 Elba Corp. All Rights Reserved.
Opinions expressed in posts are those of the author and not necessarily those of Elba Corp.