Python gesture recognition: Python gesture recognition resources - CSDN Library

2000 files in total

png: 4032

py: 4

xml: 3

Gesture recognition

Uploaded 2018-06-14 11:40:48 · 18.19MB RAR · 56 views

python gesture recognition (2000 files)

Thumbs.db 24KB

.DS_Store 6KB

._.DS_Store 120B

.gitattributes 43B

.gitignore 1KB

CNNGestureRecognizer-master.iml 398B

LICENSE 1KB

README.md 11KB

img_7_layer2_Conv2D.png 349KB

img_3_layer2_Conv2D.png 203KB

img_4_layer4_MaxPooling2D.png 182KB

img_3_layer3_Activation.png 166KB

img_3_layer4_MaxPooling2D.png 166KB

img_3_layer5_Dropout.png 166KB

new_model.png 126KB

img_3_layer1_Activation.png 109KB

ori_4015imgs_loss.png 36KB

ori_4015imgs_acc.png 34KB

model.png 17KB

stop110.png 11KB

stop109.png 11KB

stop107.png 11KB

stop108.png 11KB

stop106.png 11KB

stop111.png 11KB

stop64.png 11KB

stop105.png 11KB

stop86.png 11KB

stop84.png 11KB

stop67.png 11KB

stop63.png 11KB

stop65.png 11KB

stop112.png 11KB

stop85.png 11KB

stop68.png 11KB

stop104.png 11KB

stop87.png 11KB

stop62.png 10KB

stop82.png 10KB

stop88.png 10KB

stop78.png 10KB

stop83.png 10KB

stop79.png 10KB

stop75.png 10KB

stop61.png 10KB

stop81.png 10KB

stop72.png 10KB

stop89.png 10KB

stop66.png 10KB

stop76.png 10KB

stop80.png 10KB

stop103.png 10KB

sstop33.png 10KB

sstop34.png 10KB

sstop60.png 10KB

stop77.png 10KB

stop102.png 10KB

stop71.png 10KB

sstop54.png 10KB

stop100.png 10KB

stop90.png 10KB

sstop35.png 10KB

stop101.png 10KB

stop69.png 10KB

sstop71.png 10KB

stop113.png 10KB

stop91.png 10KB

stop73.png 10KB

stop165.png 10KB

stop173.png 10KB

stop60.png 10KB

stop59.png 10KB

stop74.png 10KB

stop92.png 10KB

sstop68.png 10KB

stop170.png 10KB

sstop62.png 10KB

sstop55.png 10KB

stop163.png 10KB

sstop36.png 10KB

stop164.png 10KB

sstop69.png 10KB

stop167.png 10KB

stop161.png 10KB

stop70.png 10KB

stop169.png 10KB

iiok2.png 10KB

stop166.png 10KB

sstop65.png 10KB

sstop61.png 10KB

sstop72.png 10KB

sstop64.png 10KB

stop93.png 10KB

stop174.png 10KB

sstop74.png 10KB

stop162.png 10KB

stop159.png 10KB

sstop73.png 10KB

sstop37.png 10KB

stop175.png 10KB

[![DOI](https://zenodo.org/badge/89872749.svg)](https://zenodo.org/badge/latestdoi/89872749)

# CNNGestureRecognizer

Gesture recognition via a convolutional neural network (CNN), implemented in Keras + Theano + OpenCV.

Key Requirements:
- Python 2.7.13
- OpenCV 2.4.8
- Keras 2.0.2
- Theano 0.9.0

Suggestion: It is better to install Anaconda, as it takes care of most of the other packages and makes it easier to set up a virtual workspace for working with multiple versions of key packages such as Python and OpenCV.

# Repo contents
- **trackgesture.py** : The main script launcher. This file contains all the code for the UI options and the OpenCV code that captures the camera contents. This script internally calls the interfaces in gestureCNN.py.
- **gestureCNN.py** : This script holds all the CNN-specific code: creating the CNN model, loading the weight file (if the model is pretrained), training the model using the image samples in **./imgfolder_b**, and visualizing the feature maps at different layers of the NN (of the pretrained model) for a given input image from the **./imgs** folder.
- **imgfolder_b** : This folder contains all the 4015 gesture images I took in order to train the model.
- **_ori_4015imgs_weights.hdf5_** : This is the pretrained weight file. If for some reason you have issues downloading it from GitHub, it can be downloaded from my Google Drive link - https://drive.google.com/open?id=0B6cMRAuImU69SHNCcXpkT3RpYkE
- **_imgs_** : This is an optional folder with a few sample images that one can use to visualize the feature maps at different layers. These are a few sample images taken from imgfolder_b.
- **_ori_4015imgs_acc.png_** : This is a plot of model accuracy vs. validation accuracy after I trained the model.
- **_ori_4015imgs_loss.png_** : This is a plot of model loss vs. validation loss after training.

# Usage
```bash
$ KERAS_BACKEND=theano python trackgesture.py
```
We set KERAS_BACKEND to switch the backend to Theano. If you have already done this via keras.json, there is no need to set it again. If TensorFlow is your default backend, it is required.

# Features
This application comes with a CNN model that recognizes up to 5 pretrained gestures:
- OK
- PEACE
- STOP
- PUNCH
- NOTHING (i.e. when none of the above gestures is input)

This application provides the following functionalities:
- Prediction : Allows the app to guess the user's gesture against the pretrained gestures. The app can dump the prediction data to the console or directly to a JSON file, which can be used to plot a real-time prediction bar chart (you can use my other script - https://github.com/asingh33/LivePlot).
- New Training : Allows the user to retrain the NN model. The user can change the model architecture or add/remove gestures. The app has built-in options for creating new image samples of user-defined gestures if required.
- Visualization : Allows the user to see the feature maps of different NN layers for a given input gesture image. It is interesting to see how the NN works and learns things.

# Demo
YouTube link - https://www.youtube.com/watch?v=CMs5cn65YK8

![](https://j.gifs.com/X6zwYm.gif)

# Gesture Input
I am using OpenCV to capture the user's hand gestures. To simplify things, I post-process the captured images to highlight the contours and edges, for example by applying a binary threshold, blurring, and grayscaling.

I have provided two modes of capturing:
- Binary Mode : Here I first convert the image to grayscale, then apply a Gaussian blur followed by an adaptive threshold filter. This mode is useful when you have an empty background such as a wall or a whiteboard.
- SkinMask Mode : In this mode, I first convert the input image to HSV and restrict the H, S, V values to a skin color range. Then I apply erosion followed by dilation.
  I then apply a Gaussian blur to smooth out the noise, use the result as a mask on the original input to remove everything that is not skin colored, and finally convert it to grayscale. This mode is useful when there is a good amount of light and you don't have an empty background.

**Binary Mode processing**
```python
import cv2

# roi is the BGR region of interest cropped from the camera frame;
# roi and minValue are defined elsewhere in the script.
gray = cv2.cvtColor(roi, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (5, 5), 2)
th3 = cv2.adaptiveThreshold(blur, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY_INV, 11, 2)
# With THRESH_OTSU set, OpenCV picks the threshold itself and ignores minValue.
ret, res = cv2.threshold(th3, minValue, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
```

![OK gesture in Binary mode](https://github.com/asingh33/CNNGestureRecognizer/blob/master/imgfolder_b/iiiok160.png)

**SkinMask Mode processing**
```python
import cv2

# roi, low_range, upper_range and skinkernel are defined elsewhere in the script.
hsv = cv2.cvtColor(roi, cv2.COLOR_BGR2HSV)

# Apply skin color range
mask = cv2.inRange(hsv, low_range, upper_range)

# Erosion followed by dilation removes small specks from the mask
mask = cv2.erode(mask, skinkernel, iterations=1)
mask = cv2.dilate(mask, skinkernel, iterations=1)

# Blur
mask = cv2.GaussianBlur(mask, (15, 15), 1)
# cv2.imshow("Blur", mask)

# Bitwise AND the mask with the original frame
res = cv2.bitwise_and(roi, roi, mask=mask)
# Color to grayscale
res = cv2.cvtColor(res, cv2.COLOR_BGR2GRAY)
```

![OK gesture in SkinMask mode](https://github.com/asingh33/CNNGestureRecognizer/blob/master/imgfolder_b/iiok44.png)

# CNN Model used
The CNN I have used for this project is a fairly common model that can be found in various CNN tutorials. I have mostly seen it used for digit classification on the MNIST database.

```python
from keras.models import Sequential
from keras.layers import Conv2D, Activation, MaxPooling2D, Dropout, Flatten, Dense

# nb_filters, nb_conv, nb_pool, nb_classes, img_channels, img_rows and img_cols
# are defined elsewhere in the project. The layer summary below is consistent
# with 32 filters, 3x3 kernels, 2x2 pooling, 5 classes and a 1x200x200 input
# (channels first); these values are inferred from the summary, not quoted.

model = Sequential()

# Two convolution + ReLU stages
model.add(Conv2D(nb_filters, (nb_conv, nb_conv),
                 padding='valid',
                 input_shape=(img_channels, img_rows, img_cols)))
convout1 = Activation('relu')
model.add(convout1)
model.add(Conv2D(nb_filters, (nb_conv, nb_conv)))
convout2 = Activation('relu')
model.add(convout2)

# Downsample, then regularize
model.add(MaxPooling2D(pool_size=(nb_pool, nb_pool)))
model.add(Dropout(0.5))

# Fully connected classifier head
model.add(Flatten())
model.add(Dense(128))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(nb_classes))
model.add(Activation('softmax'))
```

This model has the following 12 layers:
```
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d_1 (Conv2D)            (None, 32, 198, 198)      320
_________________________________________________________________
activation_1 (Activation)    (None, 32, 198, 198)      0
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 32, 196, 196)      9248
_________________________________________________________________
activation_2 (Activation)    (None, 32, 196, 196)      0
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 32, 98, 98)        0
_________________________________________________________________
dropout_1 (Dropout)          (None, 32, 98, 98)        0
_________________________________________________________________
flatten_1 (Flatten)          (None, 307328)            0
_________________________________________________________________
dense_1 (Dense)              (None, 128)               39338112
_________________________________________________________________
activation_3 (Activation)    (None, 128)               0
_________________________________________________________________
dropout_2 (Dropout)          (None, 128)               0
_________________________________________________________________
dense_2 (Dense)              (None, 5)                 645
_________________________________________________________________
activation_4 (Activation)    (None, 5)                 0
=================================================================
```

Total params: 39,348,325.0

Trainable params: 39,348,325.0

# Training
In version 1.0 of this project I used only 1204 images for training. The prediction probabilities were OK but not satisfying, so in version 2.0 I increased the training image set to 4015 images, i.e. 803 ima
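
As a cross-check on the layer summary in the CNN Model section, the output shapes and parameter counts can be reproduced with plain arithmetic. The following is a minimal sketch, not code from the project: the helper names are hypothetical, and the hyperparameters (a single-channel 200x200 input, 32 filters, 3x3 kernels, 2x2 pooling, 5 classes) are assumptions inferred from the summary rather than values stated in the README.

```python
# Hypothetical helpers for checking a Keras-style layer summary by hand.
# All hyperparameters below are inferred from the summary table, not quoted
# from the project's source code.

def conv2d_valid(shape, filters, kernel):
    """Return (output shape, parameter count) for a 'valid' 2D convolution
    with stride 1 on a channels-first input shape (channels, rows, cols)."""
    channels, rows, cols = shape
    out_shape = (filters, rows - kernel + 1, cols - kernel + 1)
    # kernel*kernel*channels weights plus one bias, per filter
    params = filters * (kernel * kernel * channels + 1)
    return out_shape, params


def max_pool(shape, pool):
    """Return the output shape of non-overlapping pool x pool max pooling."""
    channels, rows, cols = shape
    return (channels, rows // pool, cols // pool)


def dense(n_in, n_out):
    """Return the parameter count of a fully connected layer (weights + biases)."""
    return n_in * n_out + n_out


shape = (1, 200, 200)                           # assumed grayscale 200x200 input
shape, p_conv1 = conv2d_valid(shape, 32, 3)     # conv2d_1
shape, p_conv2 = conv2d_valid(shape, 32, 3)     # conv2d_2
shape = max_pool(shape, 2)                      # max_pooling2d_1
flat = shape[0] * shape[1] * shape[2]           # flatten_1
p_dense1 = dense(flat, 128)                     # dense_1
p_dense2 = dense(128, 5)                        # dense_2
total = p_conv1 + p_conv2 + p_dense1 + p_dense2

print(shape, flat)                              # pooled shape and flattened size
print(p_conv1, p_conv2, p_dense1, p_dense2, total)
```

Under these assumptions the computed values match the table: 320, 9248, 39338112 and 645 parameters, for a total of 39,348,325. Almost all of the parameters sit in the first dense layer, because it is connected to the 307,328 flattened features.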
