International Symposium on System-on-Chip

The Role of Reconfigurability Panel Talk

SoC 2012

Chair: Jari Nurmi

Panel speakers: Tapani Ahonen, Luca Benini, Claudio Brunelli and Leandro Soares Indrusiak

Report by: Farid Shamani and Leyla S. Ghazanfari

Questions:

Where can it be used?

For what can and can't it be used?

What kind of reconfigurable technology?

Strengths and weaknesses of reconfigurability

Decreasing reconfiguration time

Dynamic partial reconfiguration

How to design for reconfigurable system parts

Reconfigurable technology or programmable many-cores?

Types of reprogrammable systems:

Figure 1: S. Hauck, "The Roles of FPGAs in Reprogrammable Systems", Proceedings of the IEEE, Vol. 86, No. 4, pp. 615-638, April 1998.

Leandro: I am supposed to say where reconfiguration can be used. I think we have all seen papers where almost every part of this picture has been covered, so I can say reconfiguration has been explored in every part of it. In my own research, I have used a little reconfiguration as a functional unit inside the bus, the CPU or the coprocessor, and also in the interconnect. But I can tell you one of the reasons I don't like to use it: it is really hard to evaluate.

Every time you have something configurable, if you want to evaluate the average case and have some guarantees for the worst-case scenario, you must make a lot of assumptions in the evaluation. If you want to validate a dynamic system through simulation, you have to try many scenarios, and if the system is complex the simulation takes a long time. On the other hand, it is very complicated to build an analytical model that gives guarantees. This is why I tend not to use a reconfigurable system unless it is predictable enough to be evaluated reasonably fast.

In principle, if you have a statically scheduled reconfiguration, you know which configuration is active in which period of time. That is something I can deal with; it is easy to evaluate.
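As a minimal illustration of why a statically scheduled plan is easy to evaluate (a hypothetical sketch, not from the panel), the entire reconfiguration behaviour can be captured in a fixed table, so the worst case is read off by inspection rather than explored by simulation:

```c
/* Minimal sketch of a statically scheduled reconfiguration plan. Because
 * every (time slot -> configuration) pair is fixed at design time, the
 * worst-case behaviour can be read directly from the table instead of
 * being explored by simulating many dynamic scenarios. */
#include <stdio.h>

typedef struct {
    unsigned start_us;    /* slot start time, microseconds               */
    unsigned config_id;   /* which configuration is loaded in this slot  */
    unsigned wcet_us;     /* worst-case execution time within the slot   */
} slot_t;

static const slot_t schedule[] = {
    {    0, 0, 400 },     /* e.g. a filtering configuration */
    {  500, 1, 300 },     /* e.g. an FFT configuration      */
    { 1000, 0, 400 },     /* the period repeats             */
};

int main(void) {
    /* Evaluation is a table walk, not a state-space search. */
    for (size_t i = 0; i < sizeof schedule / sizeof schedule[0]; i++)
        printf("t=%uus: config %u, WCET %uus\n",
               schedule[i].start_us, schedule[i].config_id,
               schedule[i].wcet_us);
    return 0;
}
```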

Claudio: I like reconfigurable systems. I think they are useful, although there are still some obstacles. They are called the "holy grail" because they offer performance almost at the level of ASICs while being programmable almost at the level of FPGAs. Achieving that level of ease of use is not easy, and that is what I see as the main obstacle.

Luca: The main use of reconfigurability is where you need different functionalities and performance but cannot afford the cost of ASICs. Reconfigurability can be used, for instance, to repair a part of the system that is not working properly. What still needs to be demonstrated (many companies have tried; some succeeded, but most failed) is using reconfigurable logic as a computing replacement, and this is where it becomes very hard. I think the fundamental problem there, more than the hardware, is the programming model.

Tapani: I have to agree with Luca there; obviously fine grain reconfiguration has been a big success, while coarse grain reconfiguration has largely faltered. Anything for which you cannot have a software stack and tool support is unlikely to be a commercial success, and when it comes to replacing computing logic with coarse grain reconfigurable engines, the supporting tools are simply not there. There is great promise in coarse grain reconfigurability, though: you get rid of many unnecessary memory fetches, which saves a lot of energy in the computational kernel, but there is currently no proven way to exploit it. As long as this problem remains, I would say it is a very interesting technology, but I would not invest my money in it yet.

Jari: Thank you for the first round. Maybe we should involve the audience as well; any questions at this point, after these first position statements?

Question 1: There are different coarse grain reconfigurable architectures (CGRAs), and I have found that most research papers compare the CGRA they propose to fine grain FPGAs. The results are that the area is smaller and the power efficiency is higher, but how about comparing different CGRAs with each other? Is there a possibility to unify the evaluation framework between different CGRAs in terms of performance and efficiency? Is it at least possible to unify the criteria for a specific application domain?

Luca: If you want to make a fair comparison, you have to normalize across a number of things. One of them, for example, is the memory system in this picture (Figure 1). For a real, complete application where you could compare different types of coarse grain reconfigurable architecture, you would have to equalize the memory systems in some way. There are plenty of questions, such as how the architecture accesses external memory or what the off-chip interface is. This is really difficult to do, because in the end what really matters is system performance.

I think it is very difficult to give a precise answer, because even benchmarking this type of architecture depends so much on the memory interfaces. If you are working on the reconfigurable fabric itself, you may regard work on memory interfaces as wasted effort, but they are essential to the performance evaluation. So it is very difficult to claim a fair comparison.

Jari: So, we should have more or less a full system to compare and that’s quite difficult.

Tapani: Exactly. What you can have is a standardized benchmark, but there is no direct way to automate it; a lot of manual labour is involved in executing such a benchmark across various coarse grain architectures. So going for a standardized CGRA benchmark is rather a non-starter.

Claudio: I agree with Tapani and Luca; this is a question that crosses my mind too. I do not think it is really possible to compare them fairly in the first place, or to give an automatic answer like "A is better than B". That is just not possible, but we can still try to analyse different situations.

Question 2: My second question: dynamic partial reconfiguration today involves very little in the way of timing guarantees, which leaves state-of-the-art reconfigurable systems far behind embedded real-time systems. Is there some way to bound the reconfiguration time of dynamically partially reconfigurable systems to bring them closer to real-time systems?

Leandro: I think partially reconfigurable systems are very dependent on the technology; you rely on what the technology vendors give you. I know people who did reverse engineering to get some idea of the timing constraints of these systems, which are not fully documented. So if you want to use it in a hard real-time system: don't, because it may not be safe, unless the dynamic partial reconfiguration support from the FPGA vendors covers these sorts of things. Vendors have always put very little emphasis on supporting this sort of fun academic exercise; it has never been their main selling point. I don't know if there is any interest on the vendors' side, but then I am not a vendor, I only have an academic's view of how things are developed.

Jari: Interestingly, Altera has also introduced dynamic partial reconfiguration, not only Xilinx, so it is becoming more common. But that doesn't tell us whether it is good for real-time systems or not.
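For concreteness, this is roughly what triggering dynamic partial reconfiguration looks like from software on one vendor path; a hedged sketch assuming a Xilinx Zynq-style device running Linux, where the xdevcfg driver exposes the configuration port as a device file, and `partial_region.bit` is a hypothetical partial bitstream:

```c
/* Hedged sketch of one concrete dynamic-partial-reconfiguration path,
 * assuming a Xilinx Zynq-style device under Linux, where the xdevcfg
 * driver exposes the configuration port as /dev/xdevcfg. The bitstream
 * file name is a placeholder; other vendors use other flows entirely. */
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    FILE *bit = fopen("partial_region.bit", "rb"); /* hypothetical bitstream */
    FILE *dev = fopen("/dev/xdevcfg", "wb");       /* Zynq config interface  */
    if (!bit || !dev) { perror("open"); return EXIT_FAILURE; }

    char buf[4096];
    size_t n;
    /* Streaming the partial bitstream into the configuration port replaces
     * only the targeted region; the rest of the fabric keeps running. */
    while ((n = fread(buf, 1, sizeof buf, bit)) > 0)
        fwrite(buf, 1, n, dev);

    fclose(bit);
    fclose(dev);
    return 0;
}
```

Note that nothing in this sketch bounds the reconfiguration time, which is exactly the panel's complaint about real-time use.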

Tapani: I wouldn't even consider using partial reconfiguration for a real-time system. The real utility of the technology comes from other potential, for example inserting field-testing mechanisms on the fly, in the field. I can't justify using it for real-time applications.

Question 3: I think you all agree that one of the problems is tool support for reconfiguration. I would like to know whether the problem is only the maturity of the tools, or whether it is also related to the applications of reconfiguration. Is there any real application scenario where you really need partial reconfiguration?

Leandro: Maybe I am being over-simplistic, but the bottom line is that if you have plenty of resources, you can keep all the configurations in place and just use a multiplexer. Reconfiguration is something you do to manage your resources more efficiently. From the application point of view, an application might have different modes, different performance requirements and different timing requirements, but that does not mean it needs dynamic reconfiguration.
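Leandro's multiplexer argument can be mimicked in software terms; a sketch with hypothetical mode functions, where all the "configurations" coexist and a function-pointer table plays the role of the multiplexer, so switching modes costs nothing:

```c
/* Software analogue of "all configurations present plus a multiplexer":
 * every datapath exists simultaneously, and a table indexed by mode
 * selects among them with no reconfiguration delay. Mode names and
 * operations are hypothetical stand-ins. */
#include <stdio.h>

static int mode_filter(int x) { return x / 2; }  /* stand-in datapath A */
static int mode_scale(int x)  { return x * 3; }  /* stand-in datapath B */
static int mode_offset(int x) { return x + 7; }  /* stand-in datapath C */

typedef int (*datapath_fn)(int);
static const datapath_fn mux[] = { mode_filter, mode_scale, mode_offset };

int main(void) {
    int mode = 1;                    /* the "select line" of the multiplexer */
    printf("%d\n", mux[mode](10));   /* switching modes costs one array index */
    return 0;
}
```

The cost, of course, is that every configuration occupies resources all the time, which is precisely what runs out in the next exchange.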

Jari: How about if you run out of resources?

Leandro: Yes, then you would need it.

Jari: Obviously one problem is the long reconfiguration time of dynamic partial reconfiguration: there will always be the time for reconfiguration on top of the time for processing, so you lose some of your time budget. Of course, there are ways to try to circumvent that, like using multiple configuration controllers to hide the reconfiguration latency, or caching configurations. Do you believe in these kinds of approaches?

Leandro: This is a typical resource allocation problem. It's the trade-off between the time spent on reconfiguration and the benefit you get from it.
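That trade-off can be made concrete with a back-of-the-envelope break-even count (all numbers here are illustrative assumptions): reconfiguration pays off only when the accumulated per-invocation savings exceed the one-off reconfiguration time.

```c
/* Worked form of the reconfiguration trade-off, with assumed numbers:
 * reconfigure only when n * (t_old - t_new) > t_reconfig. */
#include <stdio.h>

int main(void) {
    double t_reconfig_ms = 12.0; /* assumed time to load the new configuration */
    double t_old_ms      = 1.0;  /* per-invocation time, current configuration */
    double t_new_ms      = 0.4;  /* per-invocation time, new configuration     */

    /* Break-even invocation count: 12 / (1.0 - 0.4) = 20 here. */
    double n_break_even = t_reconfig_ms / (t_old_ms - t_new_ms);
    printf("reconfigure only if the task runs more than %.0f times\n",
           n_break_even);
    return 0;
}
```

Configuration caching and multiple controllers, as Jari suggests, effectively shrink t_reconfig and move the break-even point down.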

Claudio: We should reduce the reconfiguration time if we can. We can use caching and LUTs to reduce it.

Question 4: Since Altera has introduced dynamic partial reconfigurability, I think some programming tools for these architectures are coming from industry. Time will show whether CGRAs turn out better than FPGAs or not. There was a similar issue in the 1990s over whether FPGAs could be used instead of DSPs, but over time the FPGA tools became smarter and smarter; the same can happen for CGRAs, with smarter tools developed for them in the future.

Jari: Let's ask the panel: what do you think? When will we have good tools for programming these coarse grain reconfigurable arrays from a high-level entry point?

Luca: In my view, if you look at the current tool offering from Xilinx, the flow from C to hardware is pretty easy. If you don't do too many tricks with counters, arithmetic units and things like that, it actually generates quite reasonable hardware, and I personally don't see this becoming much harder. I think they are making reasonable steps, but what they are missing is the software-generation tool chain: a tool chain that generates instruction streams while the architecture handles all the decisions about going to memory. Those abstractions break down today even in the high-level flow. So I think the next big stumbling block on the way from algorithm to FPGA is the memory interface problem.
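To illustrate the "no tricks" style Luca describes (a generic, tool-agnostic sketch; any real flow would add its own vendor-specific pragmas, which are omitted here), a fixed-bound kernel like this maps naturally onto a pipelined datapath:

```c
/* HLS-friendly C in the spirit of "don't do too many tricks": fixed trip
 * count, no pointer arithmetic games, one multiply-accumulate per
 * iteration. A C-to-hardware flow can pipeline or fully unroll this loop. */
#define TAPS 8

int fir(const int sample[TAPS], const int coeff[TAPS]) {
    int acc = 0;
    for (int i = 0; i < TAPS; i++)   /* fixed bound: easy to pipeline/unroll */
        acc += sample[i] * coeff[i];
    return acc;
}
```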

Leandro: I have heard that they are trying to do this for a single configuration, but regarding dynamic reconfiguration there is still a long way to go. As for what time can do: they were already talking about this when I was a PhD student, a long time ago, so it is not a problem that can be solved on demand. To this day nobody has come up with a concrete solution for how to express, and how to validate, the dynamic behaviour of a system.

Tapani: I believe in the end it comes down to the manageability of your device family and of your tool framework, because if you look at FPGAs, they are not general purpose and homogeneous throughout. What I see happening in practice is that the mainstream devices from Xilinx and Altera will get more and more specialized blocks. The tools will support that specialization in an efficient manner, but extending this to mapping a completely arbitrary algorithm onto a heterogeneous platform is a huge task, and I don't believe these companies want to go in that direction.

Jari: So is there any market pull for this kind of technology, especially for dynamically reconfigurable heterogeneous platforms?

Luca: I think that if they succeed in making these devices C-programmable, or software-programmable in general, they will open the application domain to many software programmers, and software programmers know many tricks about dynamic programmability. In computer vision, for instance, there are many opportunities to learn things dynamically and modify them, and in software we can do all these things.

Jari: So more or less you should hide the fact that there is reconfigurable hardware; you should just program it, and somehow, magically, it ends up on the reconfigurable hardware?

Luca: You are just loading a new program for the processor.

Question 5: Do you see the general purpose processing community responding simply by extending the instruction set or adding some accelerators, and thereby creating a lot less demand for reconfigurable logic in the future?

Luca: I believe it could be really great if we could add some form of configurability to the instruction sets of these architectures. One of the main problems is that processor design is a very careful art; if you start tampering with it, we don't know what will happen to your cycle time, architecture and so on. So I think this is an interesting direction, one I would like to try out. Just to make the distinction: we are not talking about configurable processors, but about reconfigurable processors, processors as units that can be dynamically reconfigured, not like a Xilinx processor where you fix your design and a specialized instruction set for a given application.

Tapani: The problem is that you need to be able to at least emulate the same instructions and run the same software on the same architecture.

Question 6: Can tool developers like Synopsys decrease time to market by helping C programmers to develop reconfigurable hardware?

Claudio: It depends on what you want to achieve, but generally these tools should decrease time to market. It depends what kind of savings you want. You can't write a piece of C code and be sure you will get exactly the hardware you want; it won't come for free. Most of the time you have to involve yourself with the hardware, and you can't rely too much on just the C code.

Jari: One problem would be that your C code is sequential, while your reconfigurable architecture platform and peripherals are not.

Claudio: There have been lots of efforts in this respect, and for sure a lot of people disagree on the subject. A tool, or a human being, should reverse-engineer the code, understand what the algorithm does, extract the parallelism, and then compare the result with the original code and point out what does not match.

Question 7: What is the role of operating systems with regard to reconfigurable architectures?

Leandro: In some many-cores you have reconfiguration at certain points that the operating system kernel would not even be aware of, because the decision is taken at a lower level.

Luca: Some parts of this problem have already been dealt with quite nicely. Look at processor-based and accelerator-based architectures; to give a very practical example, the OpenCL runtime that manages a GPU already does 90% of the job. There you have all the instructions and drivers. You can define a task graph in which the tasks are kernels; kernels are data-parallel and are scheduled onto the machine, taking processor clusters from it, and all of this is done dynamically. Scheduling these kernels on a GPU is very similar to scheduling configurations on an FPGA. So I think the answer to your question is: yes, it is very important, and people are working on it today. In my view, how you manage a GPU today is very similar to how you would manage dynamically reconfigurable hardware. There are already answers to this question; what is missing today is that the hardware fabric and the FPGAs themselves are not so friendly to this kind of dynamic management.
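As a concrete taste of the management style Luca refers to (a minimal host-side sketch using the standard OpenCL C API; the kernel source, sizes and defaults are placeholders, and error checking is omitted for brevity), building a program and enqueueing a kernel is conceptually close to loading a configuration and then scheduling work onto it:

```c
/* Minimal OpenCL host program: build a kernel ("configure"), then
 * dynamically schedule it onto the device ("reconfigure and run"). */
#include <stdio.h>
#include <CL/cl.h>

static const char *src =
    "__kernel void scale(__global float *v) {"
    "  size_t i = get_global_id(0); v[i] *= 2.0f; }";

int main(void) {
    cl_platform_id plat; cl_device_id dev;
    clGetPlatformIDs(1, &plat, NULL);
    clGetDeviceIDs(plat, CL_DEVICE_TYPE_DEFAULT, 1, &dev, NULL);

    cl_context ctx = clCreateContext(NULL, 1, &dev, NULL, NULL, NULL);
    cl_command_queue q = clCreateCommandQueue(ctx, dev, 0, NULL);

    cl_program prog = clCreateProgramWithSource(ctx, 1, &src, NULL, NULL);
    clBuildProgram(prog, 1, &dev, NULL, NULL, NULL);   /* the "configure" step */
    cl_kernel k = clCreateKernel(prog, "scale", NULL);

    float data[256] = { 1.0f };
    cl_mem buf = clCreateBuffer(ctx, CL_MEM_READ_WRITE | CL_MEM_COPY_HOST_PTR,
                                sizeof data, data, NULL);
    clSetKernelArg(k, 0, sizeof buf, &buf);

    size_t global = 256;                               /* the "schedule" step  */
    clEnqueueNDRangeKernel(q, k, 1, NULL, &global, NULL, 0, NULL, NULL);
    clEnqueueReadBuffer(q, buf, CL_TRUE, 0, sizeof data, data, 0, NULL, NULL);
    printf("%f\n", data[0]);

    clReleaseMemObject(buf); clReleaseKernel(k); clReleaseProgram(prog);
    clReleaseCommandQueue(q); clReleaseContext(ctx);
    return 0;
}
```

In Luca's analogy, swapping which kernel is enqueued corresponds to swapping which configuration is loaded onto the fabric.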

Tapani: There has been a lot of European research as well, behind several projects that really try to map an arbitrary C application onto a DSP or an FPGA. There is also a small company called Blue Bee Technology that has been claiming success. But I would like to see the demonstration when it comes to an arbitrary application on an arbitrary platform; so far I have not seen one. It is an interesting direction, though.

Question 8: Does software reconfigurability actually fall under the umbrella of reconfigurable computing?

Leandro: I don't think there is a well-established taxonomy for what reconfigurability really is. The difference between the terms programmable, reconfigurable and adaptive depends on how often you change the behaviour: is it every cycle, with every bitstream, or every time your adaptive control system decides to change mode? So it's a matter of how often you do it, how often the dynamic behaviour changes. You can find different textbooks and papers that use these words in different ways.

Question 9: From the discussion so far I have come to a few conclusions. Do we really need reconfiguration? Yes, we do; for instance, in an FPGA we can use it for recompression. Do we need partial reconfiguration? In my opinion, yes, because I want only part of my logic to change while the rest of the system keeps working as before. And of course it would be very good to involve software engineers in this field and to have a tool that delivers reconfigurable hardware from C code. The trouble is that this cannot be done yet, because the synthesis tools show the results will not be optimal enough. To my knowledge, we have not been able to convince investors to invest in making a tool chain for CGRAs.

Jari: Let's ask the panel: what would you do if you had an extra million euros to invest in a coarse grain reconfigurable company?

Tapani: Obviously development tools for CGRAs will appear in one form or another. My guess is that it will happen gradually; it is not going to be a disruptive migration driven by some brilliant new CGRA tool from an unknown company. Industrial tools will develop under market demand.

Question 10: Take, for example, devices for video and MPEG encoders, which have similar implementations. If we could move these similar implementations onto the cloud, how efficient would that be? Is it feasible or not?

Tapani: I am not a member of a cloud team who would worry about cloud computing and security, but the cloud would give you massive storage for your data.

Leandro: I really want an embedded system with the lowest possible specification that still satisfies the requirements for performance, cost, energy and so on. Suppose you have such a system backed by the cloud with its effectively infinite storage and processing: for every piece of computation I can do remotely at zero cost, I still have to pay the cost of communication to get the result back. If the cost of communication is less than the cost of computation, of course I will offload, and I could do the same kind of offloading to the threads of my own multiprocessor. The major problem at the moment is dependability, which is very hard to quantify. So even if the cost of communicating the results is low, if the dependability requirement on your kernel is high, it sounds like the cloud is not a valid option. Another problem at the moment is the reliability of the communication itself.

Tapani: The amount of data you need to move to and from the cloud can be the communication bottleneck. You face a problem when there is a potentially huge amount of data to send to the cloud up front, while only a small result comes back to your device. The future will show what ends up in the cloud in the end. Working with the cloud is also not energy efficient in some cases.
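Tapani's bottleneck argument reduces to a simple comparison (all numbers here are assumptions for illustration): offloading wins only when remote compute time plus the time to move inputs up and results back beats local execution.

```c
/* Back-of-the-envelope cloud-offload decision, with assumed numbers:
 * offload only if transfer time plus remote compute beats local compute. */
#include <stdio.h>

int main(void) {
    double upload_mb = 50.0, result_mb = 0.1;  /* big input, small result */
    double bw_mb_s   = 2.0;                    /* assumed link bandwidth  */
    double t_local_s = 8.0, t_remote_s = 0.5;  /* compute times           */

    double t_offload = (upload_mb + result_mb) / bw_mb_s + t_remote_s;
    printf("offload: %.1fs vs local: %.1fs -> %s\n", t_offload, t_local_s,
           t_offload < t_local_s ? "offload" : "stay local");
    return 0;
}
```

With these numbers the 25-second upload dominates, so the computation stays local, exactly the case Tapani warns about.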

Claudio: Last year Professor Jan Rabaey was in Tampere. In his speech he mentioned that energy consumption will certainly be the future concern, because in his vision massive amounts of computation will have to be done in data centers. I believe that in some cases bandwidth and latency are further issues. Probably in most cases latency won't be so critical that we cannot afford to wait even a very short time, but I'm not 100% sure about the power consumption.

Leandro: We have examples from 3D rendering for complex games: many companies render game video remotely, on big server farms. In those cases losing a frame might get you shot in the game, but you will not die in real life. If such a thing were controlling a plane, I am not sure I would fly on that plane.

Tapani: Exactly. If you have a battery-powered device on which you want to run a mobile game, obviously that is not going to happen on the one ampere your battery can deliver. You need the computation power from the cloud to enable this type of application.

Jari: I think we are now far from where we started; we are high up in the clouds. I think we are done, and thanks to these gentlemen for a nice discussion.