I am attempting to implement a Naive Bayes classifier using BNT and MATLAB. So

Question

0

Asked: June 5, 20262026-06-05T08:11:24+00:00 2026-06-05T08:11:24+00:00

I am attempting to implement a Naive Bayes classifier using BNT and MATLAB. So

0

I am attempting to implement a Naive Bayes classifier using BNT and MATLAB. So far I have been sticking with simple tabular_CPD variables and “guesstimating” probabilities for the variables. My prototype net so far consists of the following:

DAG = false(5);
DAG(1, 2:5) = true;
bnet = mk_bnet(DAG, [2 3 4 3 3]);
bnet.CPD{1} = tabular_CPD(bnet, 1, [.5  .5]);
bnet.CPD{2} = tabular_CPD(bnet, 2, [.1  .345   .45 .355   .45 .3]);
bnet.CPD{3} = tabular_CPD(bnet, 3, [.2  .02    .59 .2     .2  .39    .01 .39]);
bnet.CPD{4} = tabular_CPD(bnet, 4, [.4  .33333 .5  .33333 .1  .33333]);
bnet.CPD{5} = tabular_CPD(bnet, 5, [.5  .33333 .4  .33333 .1  .33333]);
engine = jtree_inf_engine(bnet);

Here variable 1 is my desired output variable, set to initially assign a .5 probability to either output class.

Variables 2-5 define CPDs for features I measure:

2 is a cluster size, ranging from 1 to a dozen or more
3 is a ratio that will be a real value >= 1
4 and 5 are standard deviation (real) values (X and Y scatter)

In order to classify a candidate cluster I break all of the feature measurements into 3-4 range brackets, like so:

...
    evidence = cell(1, 5);
    evidence{2} = sum(M > [0 2 6]);
    evidence{3} = sum(O > [0 1.57 2 3]);
    evidence{4} = sum(S(1) > [-Inf 1 2]);
    evidence{5} = sum(S(2) > [-Inf 0.4 0.8]);
    eng = enter_evidence(engine, evidence);
    marginals = marginal_nodes(eng, 1);
    e = marginals.T(1);
...

This actually works pretty well, considering I’m only guessing at range brackets and probability values. But I believe that what I should be using here is a gaussian_CPD. I think that a gaussian_CPD can learn both the optimal brackets and probabilities (as mean and covariance matrices and weights).

My problem is, I am not finding any simple examples of how the BNT gaussian_CPD class is used. How, for example, would I go about initializing a gaussian_CPD to approximately the same behavior as one of my tabular_CPD variables above?

Report

Leave an answer
Cancel reply

You must login to add an answer.

Need An Account,

1 Answer

Editorial Team · Answer 1 · 2026-06-05T08:11:25+00:00

I eventually figured this out by experimenting with BNT at the MATLAB command prompt. Here is how I defined my classifier net using gaussian_CPD nodes:

DAG = false(5); DAG(1, 2:5) = true
bnet = mk_bnet(DAG, [2 1 1 2 1], 'discrete', 1);
bnet.CPD{1} = tabular_CPD(bnet, 1, 'prior_type', 'dirichlet');
for node = 2:5
   bnet.CPD{node} = gaussian_CPD(bnet, node);
end
bnet

DAG =

     0     1     1     1     1
     0     0     0     0     0
     0     0     0     0     0
     0     0     0     0     0
     0     0     0     0     0

bnet = 

               equiv_class: [1 2 3 4 5]
                    dnodes: 1
                  observed: []
                     names: {}
                    hidden: [1 2 3 4 5]
               hidden_bitv: [1 1 1 1 1]
                       dag: [5x5 logical]
                node_sizes: [2 1 1 2 1]
                    cnodes: [2 3 4 5]
                   parents: {[1x0 double]  [1]  [1]  [1]  [1]}
    members_of_equiv_class: {[1]  [2]  [3]  [4]  [5]}
                       CPD: {[1x1 tabular_CPD]  [1x1 gaussian_CPD]  [1x1 gaussian_CPD]  [1x1 gaussian_CPD]  [1x1 gaussian_CPD]}
             rep_of_eclass: [1 2 3 4 5]
                     order: [1 5 4 3 2]

To train it, I used my original classifier to help me label a set of 300 samples, then simply ran 2/3rds of them through the training algorithm.

bnet = learn_params(bnet, lsamples);
CPD = struct(bnet.CPD{1}); % Peek inside CPD{1}
dispcpt(CPD.CPT);

1 : 0.6045 
2 : 0.3955

The output from dispcpt gives a rough idea of the breakdown between class assignments in the labeled samples in the training set.

To test the new classifier I ran the last 1/3rd of the results through both the original and the new Bayes nets. Here is the code I used for the new net:

engine = jtree_inf_engine(bnet);
evidence = cell(1, 5);
tresults = cell(3, length(tsamples));
tresults(3, :) = tsamples(1, :);
for i = 1:length(tsamples)
    evidence(2:5) = tsamples(2:5, i);
    marginal = marginal_nodes(enter_evidence(engine, evidence), 1);
    tresults{1, i} = find(marginal.T == max(marginal.T)); % Generic decision point
    tresults{2, i} = marginal.T(1);
end
tresults(:, 1:8)

ans = 

    [         2]    [     1]    [         2]    [         2]    [         2]    [     1]    [     1]    [     1]
    [1.8437e-10]    [0.9982]    [3.3710e-05]    [3.8349e-04]    [2.2995e-11]    [0.9997]    [0.9987]    [0.5116]
    [         2]    [     1]    [         2]    [         2]    [         2]    [     1]    [     1]    [     2]

Then to figure out if there was any improvement I plotted overlaid ROC diagrams. As it turned out, my original net did well enough that it was hard to tell for certain if the trained net using Gaussian CPDs did any better. Printing the areas under the ROC curves clarified that the new net did indeed perform slightly better. (base area is the original net, and area is the new one.)

conf = cell2mat(tresults(2,:));
hit = cell2mat(tresults(3,:)) == 1;
[~, ~, basearea] = plotROC(baseconf, basehit, 'r')
hold all;
[~, ~, area] = plotROC(conf, hit, 'b')
hold off;

basearea =

    0.9371

area =

    0.9555

ROC Diagram

I’m posting this here so that next time I need to do this, I’ll be able to find an answer… Hopefully someone else might find it useful as well.

Sign Up

Sign In

Forgot Password

The Archive Base Latest Questions

I am attempting to implement a Naive Bayes classifier using BNT and MATLAB. So

Leave an answerCancel reply

1 Answer

Leave an answer
Cancel reply