Parallel computing is not a new technology in the computing industry. It is a technique that has been in use for more than twenty-five years to perform various computations. Through parallel computing, it is possible to carry out both complex and simple calculations. Parallel computing may involve the use of Local Area Networks (LANs) and other interconnected computers to solve computational problems.
Despite this long history of use, parallel computing still faces numerous setbacks and challenges, such as managing parallel overheads, synchronizing threaded programs, balancing data loads to ensure efficient utilization of the processing units, and choosing an appropriate granularity of subtasks.
Additionally, one of the major challenges in parallel computing that has received inadequate attention since the invention of parallel computing techniques is how to develop an efficient virtual memory for data-parallel processing. Therefore, I thought it wise to examine how to address this major challenge facing program developers and software programmers of parallel computing systems.
In the first few paragraphs, I will review parallel computing in general, after which I shall introduce an overview of virtual memories for data-parallel computing. After the overview, I will give an elaborate explanation of the problems associated with virtual memory data-parallel computation and then finally give recommendations on how this problem can be approached. The last paragraph contains a short conclusion on the whole essay, tackling issues from parallel computing to virtual memory data-parallel computations.
Introduction to Parallel Computing. Parallel computing refers to a form of computing in which various computer applications perform numerous calculations simultaneously. Parallel computing is based on the principle of breaking large problems into smaller ones which can be solved at the same time. Gerhard Joubert, Wolfgang Nagel and Frans Peters suggest that parallel computing may take various forms such as instruction-level parallelism, task parallelism, bit-level parallelism and data parallelism (94). Parallel computing has been in use for more than two decades, chiefly with high-performance computers. According to Bischof, interest in parallel computing has escalated over the past few years due to the numerous setbacks, impediments and constraints that many users and developers of computer systems have faced when using frequency scaling techniques (265). Parallel computing has thus been seen as a relief to the users of computer computation systems.
The need to use parallel computing in carrying out computations has also been driven by the heating of computer systems when they are over-tasked. Similarly, many computer applications have not been able to perform computations simultaneously, and with the increasing need to perform numerous tasks within a short period of time, the need to develop parallel computing systems emerged. Additionally, there has been a need to increase the efficiency and accuracy of computer computations, which also contributed to the emergence of parallel computing systems. The design and architecture of parallel computing systems, however, remain a major challenge. Parallel computing often requires multiple core processors to perform the various computations the user requires.
In parallel computing, the main memory of the computer is usually shared or distributed amongst the basic processing elements. Various processing techniques have been developed to support parallel computing systems, including multi-core processing, symmetric multiprocessing, cluster processing and distributed processing. In a multi-core system, the computer performs the various computations on different execution units placed on a single chip. Symmetric multiprocessing involves the use of a bus connection to link multiple processors of a computer system; for effective bus connections, the processors used must be identical, and the shared bus limits how far such systems can scale. Cluster processing, on the other hand, is a looser grouping of computer systems joined by a connection network; the computers in a cluster do not have to be arranged in a symmetric manner. Finally, distributed processing systems are characterized by massive distribution of the processors within the system, interconnected by a network. In distributed memory systems, each processor has its own memory, which helps reduce cache contention.
Most people prefer parallel computing systems because of their ability to enhance the performance of computer systems and hence generate reliable results more quickly. A good parallel computing system runs applications built with programs developed in C++ and MATLAB, among others.
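To make the idea of breaking a large problem into smaller, simultaneously solved pieces concrete, here is a minimal sketch in standard C++ of my own (an illustration, not taken from any of the cited sources): a large vector is summed by splitting it into chunks, one per hardware thread, and combining the partial results.

    #include <algorithm>
    #include <iostream>
    #include <numeric>
    #include <thread>
    #include <vector>

    int main() {
        std::vector<double> data(1000000, 1.0);
        // One chunk per hardware thread; fall back to 1 if unknown.
        unsigned n_threads = std::max(1u, std::thread::hardware_concurrency());

        std::vector<double> partial(n_threads, 0.0);
        std::vector<std::thread> workers;
        std::size_t chunk = data.size() / n_threads;

        for (unsigned t = 0; t < n_threads; ++t) {
            std::size_t begin = t * chunk;
            std::size_t end = (t + 1 == n_threads) ? data.size() : begin + chunk;
            // Each thread solves its smaller subproblem independently.
            workers.emplace_back([&, t, begin, end] {
                partial[t] = std::accumulate(data.begin() + begin,
                                             data.begin() + end, 0.0);
            });
        }
        for (auto& w : workers) w.join();

        // Combine the partial results of the subproblems.
        double total = std::accumulate(partial.begin(), partial.end(), 0.0);
        std::cout << "sum = " << total << '\n';  // expected: 1e+06
        return 0;
    }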
Introduction to Virtual Memory for Data-Parallel Computing. Throughout the history of parallel computing, the efficiency of computations and the reliability of the results have depended on the size and speed of the processing units. However, it was later shown that both can be enhanced through the development of virtual memory data-parallel computing systems (Cormen 162). Numerous applications are utilized by parallel computing programs. These applications often use huge data sets and thus their performance needs to be accelerated. Some of these applications include blackboard systems such as Cor91 and Dav92, genetic algorithms like Mna92, visualization and graphics such as Dem88 and Sal92, and heuristic search algorithms.
I was motivated to research the possibility of improving the speed and efficiency of parallel computing systems using virtual memory data-parallel computing after I read in a local business and technology journal that most companies are greatly affected by the problem of poor and unreliable computations performed by computer systems.
Overview of Virtual Memory for Data-Parallel Computing. Data processing in parallel computing systems has been the greatest challenge that most individual and organizational users face in their attempts to carry out various computations effectively. Most users have suffered from slow data processing when using parallel computing technologies. Petersen and Peter Arbenz argued that the speed of data processing in any computer system largely depends on the features of the system itself (91). In order to perform its functions effectively, a computer system must have essential features such as high processing speed, which greatly depends on the random access memory (RAM) available to the system processors, large storage space, and high-performance auxiliary utilities like computation programs and anti-virus software. Furthermore, the performance of the computer system will rely on the memory speed of the central processing unit (CPU).
Acquiring high-speed computers may not be easy for most individual and organizational users because of the high cost of computers with high-performance memory units. For example, a processor chip with one gigabyte of random access memory (RAM) costs approximately twenty dollars, while a processor with two gigabytes of RAM costs approximately thirty dollars (Cormen 137). In my view, the cost becomes even higher when complex systems are required, for example, systems for workstations. Furthermore, processors with high-speed RAM do not guarantee efficient performance; the reliability is still poor. It is for this reason that computer architects and software programmers decided to come up with new techniques for improving virtual data processing in parallel computing.
Before proceeding into the details of virtual memory data-parallel computing, I would like to clarify that people often confuse virtual memory for data-parallel computing with virtual memory for sequential machines. Virtual memory for data-parallel computing provides the illusion of a large space with combined hardware and software support, which virtual memory for sequential machines lacks. Additionally, virtual memory for data-parallel computing is usually meant for data only and not for code, as in virtual memory for sequential machines. Virtual memory for data-parallel computing also has the ability to exploit predictable patterns of data access, a feature lacking in virtual memory for sequential machines. Finally, virtual memory for data-parallel computing usually supports computation based on vectors, and these vectors can be very large.
Historical Background. The first versions of virtual memories for data-parallel computing systems were developed more than twenty-five years ago. However, parallel computing machines that can fully support data-parallel programming are yet to be developed. According to Bodin, computer virtual memories were first developed by computer scientists at Manchester, England (68). Virtual memory refers to a computing technique that gives the user the illusion that extra memory space is available when in reality the computer might have less physical memory.
In the study, I used the descriptive research technique to determine the problem as well as to provide adequate information on how to tackle it. Furthermore, I will give recommendations on how to find possible solutions to the problem.
In trying to understand the problem at hand, I used sampling techniques which entailed sampling various organizations and individuals who have been using parallel computing in their computer systems. The actual research involved questionnaires that were sent to potential respondents by mail in addition to hand deliveries. As a researcher, I also had to remain focused on the research problem in order to ensure that the research goals and objectives were fully met. Similarly, rigorous procedures were followed during the research to ensure that the data collected and information obtained were valid and systematic. Finally, the conclusions that I have drawn are strongly based on evidence collected during the research study on the use of virtual memories in parallel computing.
In drawing the conclusions, various assumptions were made. However, great care was taken to ensure that important issues were not over-generalized. For example, I could not assume that organizational and individual users faced similar problems when using virtual memory data-parallel computing in their computer systems. Similarly, I was careful not to select so wide a topic that it would involve the use of biased samples. I also avoided incorrect reporting by not changing the data that had been collected during the actual research study, to ensure that the accuracy of the findings was maintained.
Last but not least, I also ensured that no unethical practices were deployed during the research study. For example, I avoided data collection techniques that certain respondents would find intrusive, such as asking personalized questions.
Elaboration of the Problem. The research problem was to determine the most appropriate ways in which the virtual memory data-parallel (VMDP) computing technique could be used to enhance the performance of computer processors during parallel computing. As I said before, many users of parallel computing systems have been faced with the problem of ineffective performance of their computer systems during computations. Computer scientists have suggested that this problem can be overcome by the use of virtual memory data-parallel computing. I thus based my interest on this suggestion, to find out more about how this could be possible.
Proposed Solutions. In order to solve this problem, I propose to develop a Multiple Instruction Multiple Data (MIMD) control system. This will entail the use of a separate front-end machine that continually broadcasts instructions to the main system processors, which are arranged in parallel. The MIMD control system will assist the other sequential machines in the parallel computing system in carrying out their operations and functions more efficiently. The development of a MIMD control system will also guarantee system users locality, that is, the ability of the system to carry out its functions in a localized manner. A good example of locality is element-wise operations, in which the system draws on one or more source vectors to calculate the desired results (Yoon and Miroslaw 261). In my opinion, I would prefer MIMD systems due to their ability to execute numerous instructions on different data elements.
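As an illustration of such a localized, element-wise operation, here is a short C++ sketch of my own (the worker decomposition is an assumption for illustration, not part of any cited system): two source vectors are added element by element, with each worker touching only the slice assigned to it, so no communication between workers is needed.

    #include <algorithm>
    #include <thread>
    #include <vector>

    // Element-wise addition of two source vectors: each worker reads and
    // writes only its own slice, which is exactly the locality property
    // described above.
    void elementwise_add(const std::vector<double>& a,
                         const std::vector<double>& b,
                         std::vector<double>& out,
                         unsigned n_workers) {
        std::vector<std::thread> workers;
        std::size_t chunk = (a.size() + n_workers - 1) / n_workers;
        for (unsigned w = 0; w < n_workers; ++w) {
            std::size_t begin = w * chunk;
            std::size_t end = std::min(a.size(), begin + chunk);
            workers.emplace_back([&, begin, end] {
                for (std::size_t i = begin; i < end; ++i)
                    out[i] = a[i] + b[i];  // purely local: element i only
            });
        }
        for (auto& t : workers) t.join();
    }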
Furthermore, MIMD control systems often allow algorithmic techniques to be used in a choreographed manner; permutations are performed by sorting the target vectors during the operations. Another way to solve this problem is to develop a MIMD control system that is capable of carrying out permutations at the source level. In my opinion, when the computer performs its computation at the source level, there are greater chances of increased utilization of the random access memory (RAM). The VMDP computing system will also assist in performing several computations at the same time, for example, through monotonic routes and torus permutations. Through the use of appropriate source languages such as C++, the VMDP computing system will also be able to invoke and execute routines that are specially designed for certain computations. It will also apply the principle of pack functions, in which all target vectors are assigned specific elements at every stage of the computations; a sketch of this primitive follows.
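The following is a minimal sequential C++ sketch of a pack primitive as I understand it (the function name and the scan-based slot assignment are my own illustration): elements whose flag is set are compacted into a dense target vector, with an exclusive plus-scan of the flags telling each kept element which target slot it owns.

    #include <cstddef>
    #include <vector>

    // Pack: compact the flagged elements of src into a dense target vector.
    std::vector<int> pack(const std::vector<int>& src,
                          const std::vector<bool>& keep) {
        // Exclusive plus-scan of the flags assigns each kept element a slot.
        std::vector<std::size_t> slot(src.size());
        std::size_t count = 0;
        for (std::size_t i = 0; i < src.size(); ++i) {
            slot[i] = count;
            if (keep[i]) ++count;
        }
        std::vector<int> dst(count);
        for (std::size_t i = 0; i < src.size(); ++i)
            if (keep[i]) dst[slot[i]] = src[i];
        return dst;
    }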
Additionally, Yoon and Miroslaw suggest that a well-designed virtual memory data-parallel system has the capability to balance the computation loads effectively (84). Based on this suggestion, I would develop specific codes, such as VCODE instructions and other functional codes, to assist in effective balancing of the computation loads. The VMDP system will also make use of modified NESL and VCODE so that it can easily carry out its permutations and computations. The VMDP system will also have the ability to detect unwanted permutations and perform special routine permutations to prevent their occurrence. The system will likewise have additional code added to assist in the performance of mesh and torus permutations.
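As a hedged sketch of the balancing rule such a system might apply (this is my own C++ illustration, not actual VCODE or NESL), the following function divides n vector elements among p processors so that no processor receives more than one element beyond any other:

    #include <algorithm>
    #include <cstddef>
    #include <utility>

    // Return the half-open range [begin, end) of elements owned by the
    // processor with the given rank, spreading the n % p leftovers across
    // the first few processors so loads differ by at most one element.
    std::pair<std::size_t, std::size_t>
    chunk_bounds(std::size_t n, std::size_t p, std::size_t rank) {
        std::size_t base  = n / p;   // elements every processor gets
        std::size_t extra = n % p;   // leftovers handed to the low ranks
        std::size_t begin = rank * base + std::min(rank, extra);
        std::size_t len   = base + (rank < extra ? 1 : 0);
        return {begin, begin + len};
    }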
Through an effective layout of the computation vectors, the new VMDP will be able to perform operations efficiently. According to Bodin, the manner and pattern of arrangement of vectors in a computer system often affect the performance of the system in carrying out parallel computing permutations (135). Based on Bodin's suggestion, I would develop a clearly defined vector layout which divides the vectors into smaller bands and hence eases computations. The banded vector layout has been tested by computer engineers in the recent past and has proved very effective in handling large amounts of data. In the banded layout, the length of the vector is divided into smaller bands of elements, with the size of each band restricted to a power of two. For example, if we have sixty-four elements in a vector, they can be divided into small bands of four elements each to get sixteen bands. Squaring the number of elements in each band (4² = 16) gives the same figure, so the system will require only sixteen processors, one per band.
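A small worked example of this band arithmetic, under the stated assumption of one band per processor (the variable names are my own):

    #include <cstddef>
    #include <iostream>

    int main() {
        std::size_t elements  = 64;  // vector length from the example
        std::size_t band_size = 4;   // elements per band (a power of two)
        std::size_t bands = elements / band_size;  // 64 / 4 = 16 bands

        // With one band assigned to each processor, sixteen processors
        // suffice, matching the 4² = 16 figure in the text.
        std::cout << bands << " bands -> " << bands << " processors\n";
        return 0;
    }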
In addition to changing the vector layout in data-parallel computing, other techniques such as scan operations and segmented operations can also be used to improve the efficiency of virtual memory data-parallel computing. A scan operation is a computing technique in which special programs take source vectors from the available data, manipulate them, and produce a new vector whose every element combines the current source element with those that came before it. In segmented operations, vectors are partitioned into segments that are linked together, and the operation is applied independently within each segment.
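A minimal C++17 example of a plus-scan (prefix sum), my own illustration of the operation just described; a segmented variant would simply restart the running sum at each segment boundary:

    #include <iostream>
    #include <numeric>
    #include <vector>

    int main() {
        std::vector<int> src{3, 1, 4, 1, 5};
        std::vector<int> out(src.size());

        // Each output element combines the current element with all
        // the elements produced before it.
        std::inclusive_scan(src.begin(), src.end(), out.begin());  // C++17

        for (int v : out) std::cout << v << ' ';  // prints: 3 4 8 9 14
        std::cout << '\n';
        return 0;
    }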
Evaluation. I employed a critical evaluation method which involved a thorough and precise assessment and analysis of the problem in order to determine its complete foundation. The findings and final results of the research study were also put under critical scrutiny to validate and establish their appropriateness.
Conclusions and Future Work. I would conclude that previous research on parallel computing has shown that data-parallel computations can be very expensive when virtual memories are involved. However, as I have indicated in the suggestions above, the cost of permutation can be greatly reduced if effective control systems are adopted, for instance through the development of special permutations at the source level of the operations. I would advise that various benefits accrue to any organization or individual who adopts the virtual memory data-parallel computing technique: it is cheap and has the ability to overcome the many challenges that system users encounter when using serial computing systems. Additionally, due to the ability of parallel computing systems to overcome numerous memory constraints, I would recommend VMDP to organizational and individual users who would like to have quality computations.
Based on the research that I have carried out on parallel computing and virtual memory usage, I would postulate two directions for future work that computer scientists and software developers can take to enhance computer computations. Firstly, software developers should build systems with sorting algorithms that can be applied more easily; in my view, this would help computer systems carry out general permutations with great ease. The best-known algorithms of this kind include those of Nodine and Vitter (NV92). Secondly, more research should be done on how to build an extended catalog of computations and permutations that require special handling, since there are certain classes of special permutations that cannot be detected or carried out at the source level.
Finally, I would conclude that VMDP computing is the best way to effectively overcome the challenges associated with slow system performance during parallel computations. In my view, it increases the ability of computer programs and processing units to fully utilize the system's resources and data during computations.