Author Topic: Dual SH2 usage example  (Read 2030 times)

Danny

  • Newbie
  • *
  • Posts: 32
  • Karma: +3/-0
    • View Profile
Dual SH2 usage example
« on: April 09, 2017, 04:07:09 pm »
Hi dear friends!
After much digging around I figured out how to use the two SH2 CPUs. It's rather simple actually, but a light read on the fork-join model is recommended to help grasp the concept.
Also, the Sega Saturn RAM bus has no bus snoop function, so data in RAM that is shared between both CPUs must be accessed carefully: either do a cache-through read (bypassing the CPU cache) or invalidate the whole CPU cache so that you read fresh, un-cached data.
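To make the cache-through idea concrete, here is a minimal sketch that only computes the address. It assumes the usual SH-2 memory map, where OR-ing an address with 0x20000000 selects the cache-through mirror of the same memory. The helper name cache_through_addr is made up for illustration and is not part of SGL.

```c
#include <stdint.h>

/* Assumption: on the SH-2, setting this bit selects the cache-through
   mirror of the same physical memory (e.g. 0x06xxxxxx -> 0x26xxxxxx). */
#define CACHE_THROUGH_BIT 0x20000000u

/* Hypothetical helper: returns the cache-through alias of an address. */
static inline uint32_t cache_through_addr(uint32_t cached_addr)
{
    return cached_addr | CACHE_THROUGH_BIT;
}

/* On real hardware (32-bit pointers) a shared flag could then be read as:
   volatile int *p = (volatile int *)cache_through_addr((uint32_t)&flag);
   int fresh = *p;   // bypasses this CPU's cache */
```

For example, an address in Work RAM such as 0x06004000 maps to 0x26004000. The alternative is to purge the cache before reading through the normal address.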
The source code for this "marvel" is attached. Have fun and feel free to shoot any questions you have!

Note: Jo, I used two SGL functions that you might want to wrap in your library:
- slSlaveFunc(void * pointer_to_function, void* pointer_to_function_parameter):
You can pass only a single parameter to the function you run this way, so functions with more than one parameter can't be used directly. A workaround is to create a struct and pass a pointer to it :P.
- slCashPurge():
This clears the cache. I find it really funny how they mistook cash (money) for cache, probably a Japanese translation error?
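The struct workaround for slSlaveFunc could look roughly like the sketch below. It is written so it compiles on a PC without SGL: run_on_slave_stub is a made-up stand-in that just calls the function right away, and sum_job_t and sum_job are hypothetical names. On the Saturn you would pass the same function and struct pointer to slSlaveFunc instead.

```c
#include <stddef.h>

/* Hypothetical bundle holding every argument the slave function needs,
   plus a field for its result. */
typedef struct {
    const int *data;
    size_t count;
    long result;
} sum_job_t;

/* Takes a single void* parameter, as described above, and unpacks it. */
static void sum_job(void *param)
{
    sum_job_t *job = (sum_job_t *)param;
    long total = 0;
    for (size_t i = 0; i < job->count; i++)
        total += job->data[i];
    job->result = total;
}

/* Host-side stand-in (NOT an SGL function): calls the function
   immediately so the sketch can be tried without a Saturn. */
static void run_on_slave_stub(void (*func)(void *), void *param)
{
    func(param);
}
```

Usage would be something like filling a sum_job_t, calling run_on_slave_stub(sum_job, &job), and reading job.result afterwards. On real hardware the master would also need to purge its cache (or read through the cache-through area) before trusting job.result.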
« Last Edit: April 09, 2017, 04:26:36 pm by Danny »

XL2

  • Sr. Member
  • ****
  • Posts: 341
  • Karma: +72/-1
    • View Profile
Re: Dual SH2 usage example
« Reply #1 on: April 09, 2017, 09:11:11 pm »
Amazing! Thanks for sharing this (with explanations too, it's going to be helpful!)

mindslight

  • Administrator
  • Full Member
  • *****
  • Posts: 157
  • Karma: +6/-1
    • View Profile
    • Jo Engine
Re: Dual SH2 usage example
« Reply #2 on: April 09, 2017, 11:50:02 pm »
Thanks Danny,

I'll test that :)

mindslight

  • Administrator
  • Full Member
  • *****
  • Posts: 157
  • Karma: +6/-1
    • View Profile
    • Jo Engine
Re: Dual SH2 usage example
« Reply #3 on: April 11, 2017, 04:37:43 pm »
I will make an abstraction on that on the next release  :)

Danny

  • Newbie
  • *
  • Posts: 32
  • Karma: +3/-0
    • View Profile
Re: Dual SH2 usage example
« Reply #4 on: April 12, 2017, 04:14:49 pm »
Thanks  :) I hope this information will be useful for someone!
I have done parallel programming with POSIX threads and always wondered how that was done on the Saturn.
I have read in several places that the shared bus was a big disadvantage because the RAM cannot be accessed by both CPUs at the same time. However, this is the case for most architectures; that's why there are several levels of CPU cache to deal with concurrent access and speed things up.

The Saturn is not entirely crippled in this regard. It's not like one CPU has to wait for the other all the time, because each of them has 4KB of local cache.
After researching the documentation and the forums for the correct way to use the slave CPU, I decided to make a simple test program that has concurrent access to the RAM (although not to the same addresses).
By splitting an array in two and processing one half on each CPU, we get a test case for this scenario. This is a very academic example that might not reflect real-world usage. There is also the question of how accurately the emulator models shared bus access, and of how reliable clock ticks are for measuring time. However, if we can trust these measurements, the results seem very positive: in the test program, using the two CPUs took practically half the time to do the same thing! I will burn a CD this weekend and I'm hoping the same thing happens on a real Saturn.

It is also important to mention that even if this proves to be true, that does not mean it is easy to parallelise everything. Not all algorithms can be parallelised, and synchronizing both CPUs has a big impact on performance: if both CPUs need to constantly access the same variable, we will have to do a lot of busy waiting, burning CPU time and wasting bus bandwidth.
The parallelization should be done so that both CPUs do most of the work on their local memory and only have to synchronize at the end, basically following the fork-join model.
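Here is a rough sketch of that fork-join pattern applied to the split-array test, written so it can be compiled on a PC. This is not the attached test program. All names are made up for illustration, and dispatch_to_slave_stub is a stand-in that runs the job immediately; on the Saturn the dispatch would go through slSlaveFunc, and the master would have to read the done flag and the slave's sum with the cache in mind (cache-through read or a purge).

```c
#include <stddef.h>

/* Hypothetical job description: one half of the array per CPU. */
typedef struct {
    const int *data;
    size_t count;
    long sum;
    volatile int done;   /* set by the worker when its half is finished */
} half_job_t;

static void sum_half(void *param)
{
    half_job_t *job = (half_job_t *)param;
    long local = 0;      /* accumulate locally, touch shared data once */
    for (size_t i = 0; i < job->count; i++)
        local += job->data[i];
    job->sum = local;
    job->done = 1;
}

/* Stand-in for the slave dispatch (NOT an SGL function): it runs the
   job synchronously so the sketch works without a second CPU. */
static void dispatch_to_slave_stub(void (*func)(void *), void *param)
{
    func(param);
}

static long fork_join_sum(const int *data, size_t count)
{
    size_t half = count / 2;
    half_job_t master_job = { data, half, 0, 0 };
    half_job_t slave_job  = { data + half, count - half, 0, 0 };

    dispatch_to_slave_stub(sum_half, &slave_job);  /* fork */
    sum_half(&master_job);                         /* master does its half */

    while (!slave_job.done) {
        /* join: the only synchronization point, at the very end */
    }
    return master_job.sum + slave_job.sum;
}
```

The point of the design is that each worker only touches its own half and its own job struct, so the CPUs never fight over the same variable until the single wait at the end.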

I'm sorry guys, this post is getting really long :o. I'm going to mention just a couple more things I forgot in my first post:
- My test program works fine on SSF but not on Yabause. I'm not sure whether that is because Yabause does not emulate some things correctly, but I will verify this on a real Saturn soon.
- There are some other limitations you should take into account when using the slave CPU. I will just copy and paste the most relevant part from the SGL FAQ text file:
Quote
2-5 Cautions When Using the slSlaveFunc Function

Make sure that the functions executed by the slSlaveFunc do not overwrite the
master CPU's variables.  If the variables need to be rewritten, purge the
cache on the master CPU side.  In addition, do not execute functions that
issue functions to the slave CPU (such as those related to sprite control).
There is other interesting stuff in there, so I attached the text file if you want to check it out ;).
« Last Edit: April 12, 2017, 06:04:37 pm by Danny »

corvusd

  • Jr. Member
  • **
  • Posts: 81
  • Karma: +8/-0
    • View Profile
    • Personal Web Portfolio
Re: Dual SH2 usage example
« Reply #5 on: April 13, 2017, 12:08:47 am »
Amazing, great feature improvement!!!

Next stop: SCU and SH1 to give extra powa to saturn 3D transform???
David Gámiz Jiménez

 
