A is a 2D array and n is the size of the square matrix. threads is the number of threads supplied by the user.

#pragma omp parallel for shared(A,n,k) private(i) schedule(static) num_threads(threads)
    for(k = 0; k < n - 1; ++k) {
        // for the vectoriser
        
        for(i = k + 1; i < n; i++) {
            A[i][k] /= A[k][k];
        }

        

        for(i = k + 1; i < n; i++) {
            long long int j;
            const double Aik = A[i][k];
            
            for(j = k + 1; j < n; j++) {
                A[i][j] -= Aik * A[k][j];
            }
        }
    }

I tried using collapse, but it failed with this error: work-sharing region may not be closely nested inside of work-sharing, ‘loop’, ‘critical’, ‘ordered’, ‘master’, explicit ‘task’ or ‘taskloop’ region.

With the version I thought was correct, the run time increased as I executed the code with more threads.

I tried using collapse. This is the output:

26:17: error: collapsed loops not perfectly nested before ‘for’
   26 |                 for(i = k + 1; i < n; i++) {

This is LU decomposition.

  • 1. LU decomposition is sequential over k. You can only parallelize the i and j loops. 2. Please declare the loop variables in the loop header, not outside. Commented Nov 25, 2022 at 0:35
  • @VictorEijkhout should I use collapse for that?
    – Rayan Ali
    Commented Nov 25, 2022 at 0:50
  • The i,j nest can probably be collapsed. Leave the Aik expression to the compiler to figure out. Commented Nov 25, 2022 at 5:17
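
The suggestions in the comments above can be sketched roughly as follows. This is only a minimal sketch under some assumptions: the function name lu_inplace is hypothetical, A is assumed to be an array of row pointers (double **), and no pivoting is done, so every A[k][k] is assumed to be non-zero. The k loop stays sequential, the loop variables are declared in the loop headers, and the Aik temporary is dropped so that the i,j nest is perfectly nested, which collapse(2) requires.

```c
/* Hypothetical helper: in-place LU decomposition without pivoting.
 * After the call, the strict lower triangle of A holds L (unit diagonal
 * implied) and the upper triangle holds U.
 * Assumptions: A is an array of n row pointers, each row has n doubles,
 * and every pivot A[k][k] is non-zero. */
void lu_inplace(double **A, int n, int threads)
{
    /* Step k needs the results of step k - 1, so this loop is sequential. */
    for (int k = 0; k < n - 1; ++k) {
        const double pivot = A[k][k];

        /* Scale column k below the diagonal; the iterations are independent. */
        #pragma omp parallel for schedule(static) num_threads(threads)
        for (int i = k + 1; i < n; ++i) {
            A[i][k] /= pivot;
        }

        /* Update the trailing submatrix. Nothing sits between the two
         * for statements, so the nest is perfectly nested and can be
         * collapsed. Row k and column k are only read here. */
        #pragma omp parallel for collapse(2) schedule(static) num_threads(threads)
        for (int i = k + 1; i < n; ++i) {
            for (int j = k + 1; j < n; ++j) {
                A[i][j] -= A[i][k] * A[k][j];
            }
        }
    }
}
```

Compile with OpenMP enabled (for example -fopenmp with GCC). Each k step opens two parallel regions here, so for small n the threading overhead may well outweigh the gain; any speed-up should be measured on the actual matrix sizes.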
