MicroMega's uFPU
victorf



Joined: 01 Jan 2006
Posts: 342
Location: Schenectady, New York

Posted: 17 November 2006, 13:41 PM    Post subject: MicroMega's uFPU

The MicroMega Corp's uMFPU is a floating point co-processor. The device can be seen here:
www.MicroMegaCorp.com
and is available at Sparkfun.com
www.sparkfun.com

Has anyone explored the use of one of these puppies in the ZX environment? Is it a feasible addition to a math-intensive project on a ZX-24?

Any enlightenment will be appreciated.

Vic
spamiam



Joined: 13 Nov 2005
Posts: 664

Posted: 17 November 2006, 17:43 PM

I have seen this before. I do not feel that it adds much to the functionality of the ZBasic platforms, which already support 32 bit floating point.

I wonder if it adds ANY speed for the math tasks. It might be faster if you use all 5 of its registers, and/or do repetitive calculations using an accumulator in the FPU.

But the type of FP coprocessor I would be looking for is the kind that supports 80 bit floats, and does transcendental operations too.

I once did ASM programming of the Intel 286/287 combo. It was quite nice. I have never had to do the same for later Intel chips, but I understand that the Pentium lines have one on-board.

The 287 did not have registers as such, but had a stack. Some (all?) operations could be done using the top of the stack and any element below it.

At the time I was pretty good with RPN on HP calculators, and I really liked that technique. The 287 essentially used RPN.

So, unless a coprocessor offers added accuracy over the ZBasic native math, I see little benefit.......

-Tony
stevech



Joined: 23 Feb 2006
Posts: 657

Posted: 17 November 2006, 21:04 PM

It's a PIC with FP aboard, as I recall. I don't know about the ZBasic VM, but the FP library for GCC is in pretty efficient assembly code.

I wonder if the time overhead of moving the operands and result back and forth between chips largely offsets its benefit. It's serial data, not dual-ported memory. In some very odd cases, freeing up the AVR to compute whilst the FPU runs in parallel may make sense. Seems unlikely to me.
spamiam



Joined: 13 Nov 2005
Posts: 664

Posted: 17 November 2006, 21:53 PM

stevech wrote:
It's a PIC with FP aboard, as I recall. I don't know about the ZBasic VM, but the FP library for GCC is in pretty efficient assembly code.

I wonder if the time overhead of moving the operands and result back and forth between chips largely offsets its benefit. It's serial data, not dual-ported memory. In some very odd cases, freeing up the AVR to compute whilst the FPU runs in parallel may make sense. Seems unlikely to me.


Since Don could freely learn all the tricks and efficiencies that are to be found on the web, plus his own evidently considerable expertise, I would expect that the VM is at least as efficient as GCC.

I have not done any communication calculations, but I can't see how a PIC programmed with the FPU routines could be as fast as what ZBasic does for itself when you have to add in the communication overhead.

I agree that in some cases, especially if you do not have to re-load interim values, you might have some overall improvement in speed if you off-load some FP overhead and at the same time are able to do other work while waiting for the answer.

Now if it did 80 bit calculations and was used to store interim values, then it might be well worth it!

If I had access to 80 bit FP routines in GCC, I might be able to program an AVR to do the work. It might take an ATMega16 or 32, but so what?!

Hmm, I will have to look for those 80 bit math routines!

I was starting to look for a new mini-project for an AVR.

-Tony
dkinzer
Site Admin


Joined: 03 Sep 2005
Posts: 2499
Location: Portland, OR

Posted: 17 November 2006, 22:14 PM

I did some quick tests of ZBasic FP operation times using the values pi and e. A multiplication took about 50uS, division 77uS, square root 175uS and log10 350uS. These times include reading the operand value(s) and storing the result.

As far as I can tell, the uMFPU requires sending 6 bytes to the device for each 32-bit operand plus 2 bytes for the operation command. To read the result, 2 bytes need to be sent and 4 bytes read. The time for sending and receiving alone (using the faster SPI interface) for a binary operation will be 20 bytes * 8 bits/byte * 250nS/bit = 40uS. According to the datasheet, there is an additional 15uS "read delay period" that must be observed between sending the read command and beginning to read the data. This brings the total to 55uS. There may be (probably is) an additional delay between the sending of the operation command and the operation being completed. I couldn't find any information on the operation times.
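The arithmetic above can be sketched in a few lines of Python. This is only a back-of-the-envelope model: the byte counts, the 250nS bit time and the 15uS read delay are the figures quoted in this post, and the function name transfer_time_us is made up for illustration. It does not include the FPU's own execution time.

```python
# Rough SPI transfer-time model for one uMFPU operation.
# All figures are the ones quoted in the post, not measurements.

BITS_PER_BYTE = 8
BIT_TIME_US = 0.25     # 250 nS per bit on the faster SPI interface
READ_DELAY_US = 15.0   # "read delay period" between read command and data


def transfer_time_us(num_operands):
    """Time to send operands and command, then read back a 32-bit result."""
    bytes_out = 6 * num_operands + 2   # 6 bytes per operand + 2-byte operation command
    bytes_read_cmd = 2                 # read command
    bytes_in = 4                       # 32-bit result
    total_bytes = bytes_out + bytes_read_cmd + bytes_in
    return total_bytes * BITS_PER_BYTE * BIT_TIME_US + READ_DELAY_US


# Binary operation (two operands): 20 bytes on the wire plus the read delay.
print(transfer_time_us(2))
```

Changing num_operands or the bit time shows how quickly the overhead grows or shrinks if the interface assumptions turn out to be different.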

Notwithstanding the foregoing, the ability to store a "program" in the device might be useful for exploiting the benefits of parallel processing. This could work well if you had some complex computations involving a few operands - just send the operands, instruct it to begin and go do other things while it computes the result.
stevech



Joined: 23 Feb 2006
Posts: 657

Posted: 18 November 2006, 1:28 AM

Maybe some sort of multiply-and-accumulate function is where that parallelism helps. But this kind of math is not normally done on a little 8 bit micro, I speculate - except in a low budget University lab, eh?

Pretty good numbers for FP in the VM!

When dinosaurs roamed the earth, I remember a project we were doing, using a Data General Nova 800 (16 bit, core) and we bought a hugely expensive FP box for it. Like 6U rack space. It wasn't much faster! Ah, and our 5MB hard disk was a ga-zillion dollars too.
mikep



Joined: 24 Sep 2005
Posts: 765
Location: Austin, TX

Posted: 18 November 2006, 3:15 AM

dkinzer wrote:
I did some quick tests of ZBasic FP operation times using the values pi and e. A multiplication took about 50uS, division 77uS, square root 175uS and log10 350uS. These times include reading the operand value(s) and storing the result.

Yes, these results are pretty good for a cheap 8-bit microprocessor, at about 20000 floating point operations per second (flops) for multiplication. As other people have indicated, there is very little gain to offloading small mathematical operations to another chip because of the relatively large communication overhead.
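For reference, the 20000 figure falls straight out of the 50uS multiplication time. A small Python sketch of the conversion, using only the per-operation times quoted above (the names vm_times_us and ops_per_second are just illustrative):

```python
# Convert the quoted ZBasic per-operation times into operations per second.
# The times are the ones from the quoted post; nothing here is measured.

vm_times_us = {"multiply": 50, "divide": 77, "sqrt": 175, "log10": 350}


def ops_per_second(time_us):
    """Operations per second for an operation taking time_us microseconds."""
    return 1_000_000 / time_us


for name, t in vm_times_us.items():
    print(f"{name}: {ops_per_second(t):.0f} ops/sec")
```

So the rate depends heavily on the mix of operations: only multiplication reaches the 20000 mark.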

At the other end of the scale is the Cell Broadband Engine, which has 9 separate SIMD compute cores and a blisteringly fast 200GB/s bus, and can calculate 230400000000 flops (or 230.4 Gflops). Where can you get one of these awesome chips? Well, in a Sony Playstation 3 if you are one of the lucky few, as they were just made available today in the US, or in an IBM Cell-based blade (QS20 Blade Server) that contains two Cell chips. And once you network them together it is possible to create a Petaflop (10^15) capable supercomputer (see the Los Alamos National Labs announcement). And if this chip really whets your appetite, there is a Software Development Kit that includes tools and a Full System Simulator.
spamiam



Joined: 13 Nov 2005
Posts: 664

Posted: 18 November 2006, 4:30 AM

200+G FLOP.... Seems quite good. What does a 3GHz P4 do? Less than 3G FLOPS, I presume.

-T
mikep



Joined: 24 Sep 2005
Posts: 765
Location: Austin, TX

Posted: 18 November 2006, 6:12 AM

spamiam wrote:
200+G FLOP.... Seems quite good. What does a 3GHz P4 do? Less than 3G FLOPS, I presume.
-T

You are a master of understatement, Tony :)

Of course one benchmark isn't everything as we have said before on this forum. A Pentium Xeon is about 6 G flops and various benchmarks show a 5 to 50 times performance difference depending on the workload.

I know we are totally off topic now but I just wanted to mention some other technologies besides ZBasic.
GTBecker



Joined: 18 Jan 2006
Posts: 457
Location: Cape Coral

Posted: 18 November 2006, 14:03 PM

dkinzer wrote:
A multiplication took about 50uS, division 77uS, square root 175uS and log10 350uS. These times include reading the operand value(s) and storing the result.


Well, I've got a uMFPU running on the bench. I don't know much about it, yet, but it is much more promising than just another 25-year-old 8087, guys.

This thing offers 128 float registers, a single-instruction 64-bin FFT that can expand in blocks to any binary bin size, and a 256-byte instruction queue that can be loaded while the part is busy. It is also programmable: you can write, load and execute up to 64 sophisticated, decision-making functions, and it can convert ASCII to floats and vice versa onboard. Its apparent advantage over older FPUs is that it is internally large enough to keep plenty of intermediate float results onboard and potentially never need to offload them, since the part can do its own float-ASCII string conversions, even field searching, like $GPRMC to find lat/long, for example.

It is built on a dsPIC30F30xx. Don't know about speed yet, but the specs say a multiply takes 9uS, divide 18uS, square root 24uS and log10 144uS. These are worst case times if a range is expressed, but without transfer times (SPI or I2C). So, if all you need to do is a single multiply, use the VM. Wanna do a least-squares fit? The FPU might be a better place to do that.
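To put the spec numbers next to the VM timings posted earlier in the thread, here is a rough Python sketch. It rests on several assumptions: the SPI overhead uses the byte counts estimated earlier (6 bytes per operand, 2 for the operation command, 2 for the read command, 4 for the result, 250nS per bit, 15uS read delay), the single-operand count for square root and log10 is extrapolated from that estimate, and transfer and compute are taken as strictly sequential. None of it is measured, and the names are made up for illustration.

```python
# Compare quoted ZBasic VM times with uMFPU spec times plus estimated SPI overhead.
# VM_US: VM timings posted earlier in the thread (include operand read and result store).
# FPU_US: worst-case spec times, without transfer.
VM_US = {"multiply": 50, "divide": 77, "sqrt": 175, "log10": 350}
FPU_US = {"multiply": 9, "divide": 18, "sqrt": 24, "log10": 144}
OPERANDS = {"multiply": 2, "divide": 2, "sqrt": 1, "log10": 1}  # assumed operand counts


def spi_overhead_us(num_operands):
    """Estimated SPI time: operands + op command + read command + result + read delay."""
    total_bytes = 6 * num_operands + 2 + 2 + 4
    return total_bytes * 8 * 0.25 + 15.0


def fpu_round_trip_us(op):
    """Spec compute time plus estimated transfer time for one isolated operation."""
    return FPU_US[op] + spi_overhead_us(OPERANDS[op])


for op in VM_US:
    print(f"{op}: VM {VM_US[op]} uS, FPU round trip {fpu_round_trip_us(op)} uS")
```

Under these assumptions a lone multiply stays cheaper in the VM, divide is roughly a wash, and the square root and log10 come out ahead on the FPU even with the transfer included. Keeping intermediate results in the FPU registers would cut the transfer share further.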

You guys disappoint, gentlemen. The comments sound like you've leapt to critique a tool that you do not understand. Take another - or first - look, then discredit it, if you wish. After I play with it for awhile, I'll give you some facts.
spamiam



Joined: 13 Nov 2005
Posts: 664

Posted: 18 November 2006, 14:23 PM

GTBecker wrote:
You guys disappoint, gentlemen. The comments sound like you've leapt to critique a tool that you do not understand. Take another - or first - look, then discredit it, if you wish. After I play with it for awhile, I'll give you some facts.


Well, I guess I should have looked further, but I went to their own webpage and looked for info extolling the virtues of the device. I did find something describing the newest changes, but nothing jumped out at me detailing all the wonderful things it does the way you did. I should have looked harder, and they should have made it easier to find.

It sure does sound like a great thing. I LOVE the deep instruction register. Pretty quick too, compared to the ZX platform.

The problem is that it seems like it is wonderful for recursive calculations. I have always tried to use more precision for recursive calculations than 32 bit. From their website, it sounds like it does (at least primarily) 32 bit math, though you indicate 64 bit FFT.

I would have loved it more if it used 64 or 80 bit math.

-Tony


Last edited by spamiam on 18 November 2006, 14:27 PM; edited 1 time in total
victorf



Joined: 01 Jan 2006
Posts: 342
Location: Schenectady, New York

Posted: 18 November 2006, 14:25 PM

Tom,

Thanks for your views on the uFPU. :) I am seriously considering using it in my new ZBasic project. It will need some math beyond simple multiplication/division stuff. I will go the SPI route.

Since you have a bit of experience with the FPU, do you have any code to share?

Any enlightenment will be appreciated.

Vic
GTBecker



Joined: 18 Jan 2006
Posts: 457
Location: Cape Coral

Posted: 18 November 2006, 15:53 PM

spamiam wrote:
you indicate 64 bit FFT...


No, I said 64-bin FFT.
GTBecker



Joined: 18 Jan 2006
Posts: 457
Location: Cape Coral

Posted: 18 November 2006, 18:43 PM

Vic, I haven't gotten to 1+1=2.0000000001, yet. It is encouraging, though, to see

{RESET}
>V
uM-FPU V3.0.3, I2C:00 29.48 MHz

from the onboard debug monitor. I'll probably have the equivalent of "Hello World" tonight.
mikep



Joined: 24 Sep 2005
Posts: 765
Location: Austin, TX

Posted: 18 November 2006, 19:15 PM

GTBecker wrote:
You guys disappoint, gentlemen. The comments sound like you've leapt to critique a tool that you do not understand. Take another - or first - look, then discredit it, if you wish. After I play with it for awhile, I'll give you some facts.

I did a quick comparison of the features with ZBasic. The uMFPU only supports 32-bit longs and 32-bit floating point numbers, so it is somewhat limited in that area. Most of its functions have ZBasic equivalents. Here are the differences I could find from reading the instruction set:
  • 64-bin FFT
  • Inverse table lookup for 32-bit floats
  • Inverse table lookup for 32-bit longs
  • Matrix operations
  • Polynomial evaluation
  • Nth Root
All of these functions could be programmed in ZBasic without too much effort and for many applications the performance will be just fine. If there is sufficient demand I'm sure that Don could add one or more of these features to the ZVM (not ATmega32 devices).

Perhaps the most useful capability is to download programs to the FPU and have them execute in the background.


Last edited by mikep on 18 November 2006, 23:59 PM; edited 1 time in total
Page 1 of 3

 


All content Copyright © 2005-2012 Elba Corp. All Rights Reserved.
Opinions expressed in posts are those of the author and not necessarily those of Elba Corp.