Editorial Team
Asked: June 4, 2026

I recently wanted to use a simple CUDA matrix-vector multiplication. I found a suitable function in the cuBLAS library: cublas<t>gbmv. Here is the official documentation

But it is actually very poor, so I couldn't work out what the kl and ku parameters mean. Moreover, I have no idea what the stride is (it must also be provided).
There is a brief explanation of these parameters (Page 37), but it looks like I need to know something else.

Searching the internet doesn't turn up much useful information on this question, mostly references to different versions of the documentation.

So I have several questions for GPU/CUDA/cuBLAS gurus:

  1. Where can I find more understandable docs or guides about using cuBLAS?
  2. If you know how to use this particular function, could you explain how to use it?
  3. Is the cuBLAS library perhaps unusual, and does everyone use something more popular, better documented, and so on?

Thanks a lot.

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Added an answer on June 4, 2026 at 4:44 pm

    BLAS (Basic Linear Algebra Subprograms) is, as the name says, an API for basic linear algebra routines. It includes vector-vector operations (Level 1 BLAS routines), matrix-vector operations (Level 2) and matrix-matrix operations (Level 3). There is a "reference" BLAS available that implements everything correctly, but most of the time you'd use an implementation optimized for your architecture. cuBLAS is an implementation for CUDA.

    The BLAS API was so successful at describing the basic operations that it has become very widely adopted. However, (a) the names are incredibly cryptic because of limitations of the day (this was 1979, and the names were kept to six characters or fewer so that the code would compile everywhere), and (b) it is successful because it's quite general, so even the simplest function calls take a lot of extra arguments.

    Because it’s so widespread, it’s often assumed that if you’re doing numerical linear algebra, you already know the general gist of the API, so implementation manuals often leave out important details, and I think that’s what you’re running into.

    The Level 2 and 3 routines generally have function names of the form TMMOO..., where T is the numerical type of the matrix/vector (S/D for single/double precision real, C/Z for single/double precision complex), MM is the matrix type (GE for general, i.e. just a dense matrix you can't say anything else about; GB for a general banded matrix; SY for symmetric matrices; etc.), and OO is the operation.
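To make the naming scheme above concrete, here is a minimal sketch in C that splits a routine name into its T, MM and OO parts. The `BlasName` struct and both helper functions are hypothetical illustrations, not part of BLAS or cuBLAS, and the sketch assumes a well-formed Level 2/3 name of four to six characters.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper type (not part of any BLAS library): the three parts
   of a Level 2/3 routine name of the form T MM OO. */
typedef struct {
    char type;       /* 'S', 'D', 'C' or 'Z' */
    char matrix[3];  /* e.g. "GE", "GB", "SY" */
    char op[4];      /* e.g. "MV", "MM" */
} BlasName;

/* Split a name such as "SGEMV" into type, matrix kind and operation. */
static BlasName decode_blas_name(const char *name)
{
    BlasName out;
    size_t len = strlen(name);
    assert(len >= 4 && len <= 6);  /* T + MM + one to three op characters */
    out.type = name[0];
    memcpy(out.matrix, name + 1, 2);
    out.matrix[2] = '\0';
    strcpy(out.op, name + 3);      /* at most 3 characters plus terminator */
    return out;
}

/* Spell out the numerical type letter. */
static const char *precision_of(char type)
{
    switch (type) {
    case 'S': return "single-precision real";
    case 'D': return "double-precision real";
    case 'C': return "single-precision complex";
    case 'Z': return "double-precision complex";
    default:  return "unknown";
    }
}
```

For example, `decode_blas_name("SGEMV")` yields type `S`, matrix kind `GE` and operation `MV`, which reads as a single-precision general matrix-vector product.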

    This all seems slightly ridiculous now, but it worked and still works relatively well: you quickly learn to scan these names for familiar operations, so that SGEMV is a single-precision general-matrix times vector multiplication (which is probably what you want, not SGBMV), DGEMM is a double-precision matrix-matrix multiply, and so on. But it does take some practice.
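For contrast, here is a rough CPU sketch of what the banded (GB) variant expects, since the question asks about kl and ku. In the standard BLAS band storage, kl is the number of sub-diagonals and ku the number of super-diagonals, and the dense entry A(i, j) (0-based) is kept in row ku + i - j of column j of a compact array with leading dimension at least kl + ku + 1. The names `band_index` and `ref_gbmv` are made up for this illustration, and the code only covers the no-transpose case with α = 1, β = 0.

```c
#include <assert.h>

/* Position of dense entry A(i, j) inside the band array AB (0-based,
   column-major). Only meaningful inside the band: j - ku <= i <= j + kl. */
static int band_index(int i, int j, int ku, int lda)
{
    return (ku + i - j) + j * lda;
}

/* Reference y = A * x for an m x n band matrix held in band storage.
   Illustrative CPU code only; this is not the cuBLAS implementation. */
static void ref_gbmv(int m, int n, int kl, int ku,
                     const float *AB, int lda, const float *x, float *y)
{
    assert(lda >= kl + ku + 1);
    for (int i = 0; i < m; ++i)
        y[i] = 0.0f;
    for (int j = 0; j < n; ++j) {
        int lo = j - ku < 0 ? 0 : j - ku;          /* first row in the band */
        int hi = j + kl > m - 1 ? m - 1 : j + kl;  /* last row in the band */
        for (int i = lo; i <= hi; ++i)
            y[i] += AB[band_index(i, j, ku, lda)] * x[j];
    }
}
```

As a usage example, the tridiagonal matrix with rows (1 2 0), (3 4 5), (0 6 7) has kl = ku = 1 and is stored, column by column, as {0,1,3, 2,4,6, 5,7,0} with lda = 3; the zeros are padding slots that are never read. If the matrix is dense rather than banded, none of this applies and the GE routines are the simpler choice.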

    So if you look at the cuBLAS sgemv documentation, or at the documentation of the original BLAS routine, you can step through the argument list. First, the basic operation is

    This function performs the matrix-vector multiplication
    y = α op(A) x + β y
    where A is an m x n matrix stored in column-major format, x and y
    are vectors, and α and β are scalars.

    Here op(A) can be A, A^T (transpose), or A^H (conjugate transpose). So if you just want y = Ax, as is the common case, then α = 1, β = 0, and transa == CUBLAS_OP_N.
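The operation described above can be sketched on the CPU as follows. This is only a reference loop for the no-transpose case, written to show how column-major storage and the leading dimension lda are indexed; the function name `ref_sgemv_n` is hypothetical. The corresponding cuBLAS routine is `cublasSgemv`, which takes a handle and pointers to the two scalars; the device pointer names in the comment are placeholders.

```c
#include <assert.h>

/* CPU reference for y = alpha * A * x + beta * y, with A (m x n) stored
   column-major and op(A) = A. Entry A(i, j) lives at A[i + j * lda].
   Unlike real BLAS, y is read even when beta is 0, so initialize it.

   On the GPU the equivalent call would look roughly like
     cublasSgemv(handle, CUBLAS_OP_N, m, n, &alpha, d_A, lda,
                 d_x, 1, &beta, d_y, 1);
   where handle, d_A, d_x and d_y are assumed to be set up elsewhere. */
static void ref_sgemv_n(int m, int n, float alpha, const float *A, int lda,
                        const float *x, int incx, float beta,
                        float *y, int incy)
{
    assert(lda >= m && incx > 0 && incy > 0);
    for (int i = 0; i < m; ++i) {
        float acc = 0.0f;
        for (int j = 0; j < n; ++j)
            acc += A[i + j * lda] * x[j * incx];
        y[i * incy] = alpha * acc + beta * y[i * incy];
    }
}
```

For a 2 x 3 matrix with rows (1 2 3) and (4 5 6), the column-major array is {1,4, 2,5, 3,6} with lda = 2. Calling the function with α = 1, β = 0 and x = (1, 1, 1) gives y = (6, 15).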

    incx is the stride between consecutive elements of x; there are many situations where this comes in handy, but if x is just a simple 1-D array containing the vector, then the stride is 1.
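One situation where a stride other than 1 is useful: treating a row of a column-major matrix as a vector without copying it. The sketch below uses a hypothetical `ref_sdot` (a plain strided dot product, written here for illustration rather than taken from a library) to show how the same array can be read with different increments.

```c
#include <assert.h>

/* Strided dot product: sums x[i * incx] * y[i * incy] for i = 0..n-1.
   Illustrative CPU code; assumes positive increments. */
static float ref_sdot(int n, const float *x, int incx,
                      const float *y, int incy)
{
    assert(incx > 0 && incy > 0);
    float acc = 0.0f;
    for (int i = 0; i < n; ++i)
        acc += x[i * incx] * y[i * incy];
    return acc;
}
```

With the 2 x 3 column-major array M = {1,4, 2,5, 3,6}, row 1 of the matrix is (4, 5, 6). Its elements sit two floats apart in memory, so it can be passed as `M + 1` with incx = 2, while an ordinary contiguous vector is passed with an increment of 1.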

    And that’s about all you need for SGEMV.
