[Rcppdevel] iterator for sparse Matrix
Saurabh B
saurabh.writes at gmail.com
Thu Jan 23 00:04:25 CET 2014
Hi there,
I am new to RcppArmadillo and very excited about the sparse matrix
functionality it offers. I have a large optim that I wish to break down
into stochastic descent in Rcpp to speed things up.
To start small, I used a simple logisitic regression function that I have
working in R 
Norm < function(w1, w2){
sqrt(sum((w2  w1)^2))
}
xentropy < function(x, y, w) {
 ((y * x) / (1 + exp(y * t(w) %*% x)) )
}
stochDesc < function(param, eta = 0.01, maxit = 10, reltol = .01, x, y) {
# Initialize
w = param
iter = 0
rel_err = reltol + 1
x_dash = cbind(rep(1, nrow(x)), x) # Add bias
while(iter < maxit & rel_err > reltol ) {
iter = iter + 1
prev_w < w
for(curidx in sample(nrow(x))) { # Random shuffle
grad < xentropy(x_dash[curidx, ], y[curidx], w)
w < w  eta * grad # Update weights
}
rel_err = Norm(w, prev_w)
# could also use:
# rel_err = base::norm(matrix(prev_w  w, nrow = 1), type = "F")
cat(paste("Epoch ", iter, " Relative cost:", rel_err, "\n"))
}
list(w = w, iter = iter)
}
This does every operation that my more complex gradient descent will do,
but it is pretty straightforward so I started converting it to Rcpp 
#include <RcppArmadillo.h>
#include <iostream>
using namespace std;
using namespace Rcpp ;
// [[Rcpp::depends(RcppArmadillo)]]
double Norm(NumericVector w1, NumericVector w2) {
// Compute the Forbius Norm
double op = pow(sum(pow(w2  w1, 2)), .5);
return(op);
}
NumericVector xentropy(NumericVector x, NumericVector y, NumericVector w) {
// Return the cross entropy
// NumericVector op =  ((y * x) / (1 + exp(y * trans(w) %*% x)) ); //
How to transpose?
NumericVector op = w; // Not correct
return (op);
}
// [[Rcpp::export]]
NumericVector stochDescCpp(NumericVector param, NumericMatrix x,
NumericVector y,
double eta = 0.01, int maxit = 10, double reltol
= .01) {
// Run stochastic Descent to compute logistic regression
// Initialize
NumericVector w (param);
NumericVector prev_w (w);
int epoch = 0;
NumericVector grad (param.length());
double rel_err = reltol + 1;
// Don't know how to add bias, will do that in R before passing it here.
//NumericMatrix x_dash(x.nrow(), x.ncol() + 1, rep(1, x.nrow()),
x.begin());
// Loop through for each epoch
while(epoch < maxit && rel_err > reltol) {
epoch++;
prev_w = w;
for(int curidx = 0; curidx < x.nrow(); ++curidx) { // How to random
shuffle?
grad = xentropy(x(curidx, _), y[curidx], w);
w = w  eta * grad; // Update weights
}
rel_err = Norm(w, prev_w);
cout << "Epoch " << epoch << " Relative cost:" << rel_err << endl;
}
return(w);
}
I am running into many issues. Here they are ranked in order of importance)

1) When I run this (incorrect) program 
stochDescCpp(param = rep(0,3), x = X, y = Y, maxit = 100)
I get 
Error in .Primitive(".Call")(<pointer: 0x0000000069d42560>, param, x, :
negative length vectors are not allowed
All the inputs are valid and work fine with the R version. So I don't know
where this is coming from? I put a bunch of couts and I know that it is
happening inside the for loop but don't know why. In general, what is the
best way to debug code written this way? I am using RStudio.
2) I don't know how to transpose the matrix in rCpp. I read that I could
that using rCppArmadillo, but then I don't know how to convert from
NumericMatrix to arma matrix. I don't want to copy as this operation will
happen 1000x of times, so this has to be fast.
3) I don't know how to randomly shuffle the rows. I tried RcppArmadillo::sample
but haven't gotten the output to work. E.g. do I access elements as
x(rowOrd[curIdx],_)?
4) (minor issue) How do I add the bias unit? In other words, add a new
column to the input matrix filled with all 1s. Do I have to create a new
matrix, copy it column by column? this is not a dealbreaker since I can do
this in R before I send in the inputs.
Sorry if my questions are too basic and don't seem well researched. I am
trying to go for a practical example.
Any help will be greatly appreciated.
Sincerely,
Saurabh
 next part 
An HTML attachment was scrubbed...
URL: <http://lists.rforge.rproject.org/pipermail/rcppdevel/attachments/20140122/4335a454/attachment.html>
More information about the Rcppdevel
mailing list